HUMAN SIGNAL PEPTIDE-CONTAINING PROTEINS



FIELD OF THE INVENTION 



This invention relates to nucleic acid and amino acid sequences of human signal 
peptide-containing proteins and to the use of these sequences in the diagnosis, treatment, and 
prevention of cancer and immunological disorders. 



Protein transport is an essential process for all living cells. Transport of an individual 
protein usually occurs via an amino-terminal signal sequence which directs, or targets, the 
protein from its ribosomal assembly site to a particular cellular or extracellular location. 
Transport may involve any combination of several of the following steps: contact with a
chaperone, unfolding, interaction with a receptor and/or a pore complex, addition of energy, 
and refolding. Moreover, an extracellular protein may be produced as an inactive precursor. 
Once the precursor has been exported, removal of the signal sequence by a signal peptidase 
and posttranslational processing (e.g., glycosylation or phosphorylation) activates the protein. 
Signal sequences are common to receptors, matrix molecules (e.g., adhesion, cadherin,
extracellular matrix, integrin, and selectin), cytokines, hormones, growth and differentiation
factors, neuropeptides, vasomediators, phosphokinases, phosphatases, phospholipases, 
phosphodiesterases, G and Ras-related proteins, ion channels, transporters/pumps, proteases, 
and transcription factors. 

G-protein coupled receptors (GPCRs) are a superfamily of integral membrane proteins
which transduce extracellular signals. GPCRs include receptors for biogenic amines, e.g.,
dopamine, epinephrine, histamine, glutamate (metabotropic effect), acetylcholine (muscarinic 
effect), and serotonin; for lipid mediators of inflammation such as prostaglandins, platelet 
activating factor, and leukotrienes; for peptide hormones such as calcitonin, C5a
anaphylatoxin, follicle stimulating hormone, gonadotropin releasing hormone, neurokinin,
oxytocin, and thrombin; and for sensory signal mediators, e.g., retinal photopigments and 
olfactory stimulatory molecules.
The structure of these highly-conserved receptors consists of seven hydrophobic
transmembrane regions, cysteine disulfide bridges between the second and third extracellular
loops, an extracellular N-terminus, and a cytoplasmic C-terminus. Three extracellular loops
alternate with three intracellular loops to link the seven transmembrane regions. The
N-terminus interacts with ligands, the disulfide bridge interacts with agonists and antagonists,
and the large third intracellular loop interacts with G proteins to activate second messengers
such as cyclic AMP (cAMP), phospholipase C, inositol triphosphate, or ion channel proteins.
The most conserved parts of these proteins are the transmembrane regions and the first two
cytoplasmic loops. A conserved acidic-Arg-aromatic triplet present in the second
cytoplasmic loop may interact with the G proteins. The consensus pattern

[GSTALIVMYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-[LIVM]

is characteristic of most proteins belonging to this superfamily. (Watson, S. and Arkinstall, S.
(1994) The G-protein Linked Receptor Facts Book. Academic Press, San Diego, CA, pp. 2-6;
and Bolander, F.F. (1994) Molecular Endocrinology. Academic Press, San Diego, CA, pp. 8-19.)
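
By way of illustration only, the GPCR consensus pattern recited above can be translated into a regular expression and used to scan an amino acid sequence. The Python sketch below assumes the pattern follows PROSITE-style syntax (square brackets list permitted residues, braces list excluded residues, and x(n) denotes n arbitrary residues). The helper names prosite_to_regex and find_motif are hypothetical and are not part of this disclosure.

```python
import re

# Consensus pattern as recited in the text (PROSITE-style syntax assumed).
GPCR_PATTERN = (
    "[GSTALIVMYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-[LIVMFT]-"
    "[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-[LIVM]"
)


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        # Split an element such as "x(2)" into its core "x" and repeat count "2".
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            regex = "."                        # any residue
        elif core.startswith("[") and core.endswith("]"):
            regex = core                       # one of the listed residues
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"    # any residue except those listed
        else:
            regex = re.escape(core)            # a single fixed residue
        if lo is not None:
            regex += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(regex)
    return "".join(parts)


def find_motif(sequence: str, pattern: str = GPCR_PATTERN):
    """Return (start, matched_text) pairs for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group(0)) for m in regex.finditer(sequence.upper())]
```

A match to the pattern is only suggestive of membership in the superfamily; the text states that the pattern is characteristic of most, not all, members.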

Tetraspanins are a superfamily of membrane proteins which facilitate the formation
and stability of cell-surface signaling complexes containing lineage-specific proteins,
integrins, and other tetraspanins. They are involved in cell activation, proliferation (including
cancer), differentiation, adhesion, and motility. These proteins cross the membrane four
times, have conserved intracellular N- and C-termini and an extracellular, non-conserved
hydrophilic domain. Three highly conserved polar amino acids are located in the
transmembrane domains (TM), an asparagine in TM1 and a glutamate or glutamine in TM3
and TM4. Two to three conserved charged residues, including a glutamic acid residue, are
present in the cytoplasmic loop between TM2 and TM3. The extracellular loop between
TM3 and TM4 contains four conserved cysteine residues: two in a conserved CCG motif
located about 50 residues C-terminal to TM3; one, often preceded by glycine, 11 residues
N-terminal to TM4; and one in the extracellular loop that may be found in a PXSC motif.
Tetraspanins include, e.g., platelet and endothelial cell membrane proteins, leukocyte surface
proteins, tissue specific and tumorous antigens, and the retinitis pigmentosa-associated gene
peripherin. (Maecker, H.T. et al. (1997) FASEB J. 11:428-442.)

Matrix proteins (MPs) function in formation, growth, remodeling, and maintenance of
tissues and as important mediators and regulators of the inflammatory response. The
expression and balance of MPs may be perturbed by biochemical changes that result from
congenital, epigenetic, or infectious diseases. In addition, MPs affect leukocyte migration,
proliferation, differentiation, and activation in immune response.

MPs encompass a variety of proteins and their functions. Extracellular matrix (ECM) 
proteins are multidomain proteins that play an important role in the diverse functions of the 
ECM. ECM proteins are frequently characterized by the presence of one or more domains 
which may include collagen-like domains, EGF-like domains, immunoglobulin-like domains,
fibronectin-like domains, and vWFA-like modules. (Ayad, S. et al. (1994) The Extracellular
Matrix Facts Book. Academic Press, San Diego, CA, pp. 2-16.) Cell adhesion molecules
(CAMs) have been shown to stimulate axonal growth through homophilic and/or heterophilic
interactions with other molecules. In addition, interactions between adhesion molecules and
their receptors can potentiate the effects of growth factors upon cell biochemistry via shared
signaling pathways. (Ruoslahti, E. (1997) Kidney Int. 51:1413-1417.) Cadherins comprise a
family of calcium-dependent glycoproteins that function in mediating cell-cell adhesion in
solid tissues of multicellular organisms. Integrins are ubiquitous transmembrane adhesion
molecules that link cells to the ECM by interacting with the cytoskeleton. Integrins also
function as signal transduction receptors and stimulate changes in intracellular calcium levels
and protein kinase activity. (Sjaastad, M.D. and Nelson, W.J. (1997) BioEssays 19:47-55.)
Lectins are proteins characterized by their ability to bind carbohydrates on cell
membranes by means of discrete, modular carbohydrate recognition domains (CRDs).
(Kishore, U. et al. (1997) Matrix Biol. 15:583-592.) Certain cytokines and
membrane-spanning proteins have CRDs which may enhance interactions with extracellular or
intracellular ligands, with proteins in secretory pathways, or with molecules in signal
transduction pathways. The lipocalin superfamily constitutes a phylogenetically conserved
group of more than forty proteins that function by binding to and transporting a variety of
physiologically important ligands. Members of this family function as carriers of retinoids,
odorants, chromophores, pheromones, and sterols, and a subset of these proteins may be
multifunctional, serving either as a biosynthetic enzyme or as a specific enzyme inhibitor.
(Tanaka, T. et al. (1997) J. Biol. Chem. 272:15789-15795; and van't Hof, W. et al. (1997) J.
Biol. Chem. 272:1837-1841.) Selectins are a family of calcium ion-dependent lectins
expressed on inflamed vascular endothelium and the surface of some leukocytes. They
mediate rolling movement and adhesive contacts between blood cells and blood vessel walls.
The structure of the selectins and their ligands supports the type of bond formation and
dissociation that allows a cell to roll under conditions of flow. (Rossiter, H. et al. (1997) Mol.
Med. Today 3:214-222.)

Protein kinases regulate many different cell proliferation, differentiation, and 
signaling processes by adding phosphate groups to proteins. Reversible protein 
phosphorylation is a key strategy for controlling protein functional activity in eukaryotic
cells. The high energy phosphate which drives this activation is generally transferred from
adenosine triphosphate molecules (ATP) to a particular protein by protein kinases and
removed from that protein by protein phosphatases. Phosphorylation occurs in response to
extracellular signals, cell cycle checkpoints, and environmental or nutritional stresses.
Protein kinases may be roughly divided into two groups: protein tyrosine kinases (PTKs),
which phosphorylate tyrosine residues, and serine/threonine kinases (STKs), which
phosphorylate serine or threonine residues. A few protein kinases have dual specificity. A
majority of kinases contain a similar 250-300 amino acid catalytic domain which can be
further divided into eleven subdomains. The N-terminal domain, which contains subdomains
I to IV, generally folds into a two-lobed structure which binds and orients the ATP (or GTP)
donor molecule. The larger C-terminal domain, which contains subdomains VIA to XI, binds
the protein substrate and carries out the transfer of the gamma phosphate from ATP to the
hydroxyl group of the target amino acid residue. Subdomain V links the two domains. Each
of the 11 subdomains contains specific residues and motifs that are characteristic and are
highly conserved. (Hardie, G. and Hanks, S. (1995) The Protein Kinase Facts Book. Vol I,
pp. 7-47, Academic Press, San Diego, CA.)

Protein phosphatases remove phosphate groups from molecules previously modified 
by protein kinases, thus participating in cell signaling, proliferation, differentiation, contacts,
and oncogenesis. Protein phosphorylation is a key strategy used to control protein functional
activity in eukaryotic cells. The high energy phosphate is transferred from ATP to a protein
by protein kinases and removed by protein phosphatases. There appear to be three
evolutionarily distinct protein phosphatase gene families: protein phosphatases (PPs); protein
tyrosine phosphatases (PTPs); and acid/alkaline phosphatases (APs). PPs dephosphorylate
phosphoserine/threonine residues and are important regulators of many cAMP-mediated
hormone responses in cells. PTPs reverse the effects of protein tyrosine kinases and therefore
play a significant role in cell cycle and cell signaling processes. Although APs
dephosphorylate substrates in vitro, their role in vivo is not well known. (Carbonneau, H. and
Tonks, N.K. (1992) Annu. Rev. Cell Biol. 8:463-493.)

Protein phosphatase inhibitors control the activities of specific phosphatases. A
specific inhibitor of PP-I, I-1, has been identified that, when phosphorylated by
cAMP-dependent protein kinase (PKA), specifically binds to PP-I and inhibits its activity. Since
PP-I dephosphorylates many of the proteins phosphorylated by PKA, activation of I-1 by
PKA serves to amplify the effects of PKA and the many cAMP-dependent responses
mediated by PKA. In addition, since PP-I also dephosphorylates many phosphoproteins that
are not phosphorylated by PKA, I-1 activation serves to exert cAMP control over other
protein phosphorylations. I1PP2A is a specific and potent inhibitor of PP-IIA. (Li, M. et al.
(1996) Biochemistry 35:6998-7002.) Since PP-IIA is the main phosphatase responsible for
reversing the phosphorylations of serine/threonine kinases, I1PP2A has broad effects in
controlling protein phosphorylations.

Cyclic nucleotides (cAMP and cGMP) function as intracellular second messengers to
transduce a variety of extracellular signals, including hormones, light, and
neurotransmitters. Cyclic nucleotide phosphodiesterases (PDEs) degrade cyclic nucleotides
to their corresponding monophosphates, thereby regulating the intracellular concentrations of
cyclic nucleotides and their effects on signal transduction. At least seven families of
mammalian PDEs have been identified based on substrate specificity and affinity, sensitivity
to cofactors, and sensitivity to inhibitory drugs. (Beavo, J.A. (1995) Physiological Reviews
75:725-748.) PDEs are composed of a catalytic domain of ~270 amino acids, an N-terminal
regulatory domain responsible for binding cofactors and, in some cases, a C-terminal domain
with unknown function. Within the catalytic domain, there is approximately 30% amino acid
identity between PDE families and ~85-95% identity between isozymes of the same family.
Furthermore, within a family there is extensive similarity (>60%) outside the catalytic
domain, while across families there is little or no sequence similarity. A variety of diseases
have been attributed to increased PDE activity, and inhibitors of PDEs have been used
effectively as anti-inflammatory, antihypertensive, and antithrombotic agents. (Verghese,
M.W. et al. (1995) Mol. Pharmacol. 47:1164-1171; and Banner, K.H. and Page, C.P. (1995)
Eur. Respir. J. 8:996-1000.)

Phospholipases (PLs) are enzymes that catalyze the removal of fatty acid residues 
from phosphoglycerides. PLs play an important role in transmembrane signal transduction 
and are named according to the specific ester bond in phosphoglycerides that is hydrolyzed, 
i.e., A1, A2, C, or D. PLA2 cleaves the ester bond at position 2 of the glycerol moiety of
membrane phospholipids, giving rise to arachidonic acid. Arachidonic acid is the common
precursor to four major classes of eicosanoids: prostaglandins, prostacyclins, thromboxanes,
and leukotrienes. Eicosanoids are signaling molecules involved in the contraction of smooth
muscle, platelet aggregation, and pain and inflammatory responses. PLC is an important link
in certain receptor-mediated signal transduction pathways. Extracellular signaling
molecules including hormones, growth factors, neurotransmitters, and immunoglobulins bind
to their respective cell surface receptors and activate PLC. Activated PLC generates second
messenger molecules from the hydrolysis of inositol phospholipids that regulate cellular
processes, e.g., secretion, neural activity, metabolism, and proliferation. (Alberts, B. et al.
(1994) Molecular Biology of the Cell. Garland Publishing, Inc., New York, NY, pp. 85, 211,
239-240, 642-645.)

The nucleotide cyclases, i.e., adenylate and guanylate cyclase, catalyze the synthesis 
of the cyclic nucleotides, cAMP and cGMP, from ATP and GTP, respectively. They act in
concert with phosphodiesterases, which degrade cAMP and cGMP, to regulate the cellular
levels of these molecules and their functions. cAMP and cGMP function as intracellular
second messengers to transduce a variety of extracellular signals, e.g., hormones, light,
and neurotransmitters. Adenylate cyclase is a plasma membrane protein that is coupled with
various hormone receptors also located on the plasma membrane. Binding of a hormone to
its receptor activates adenylate cyclase which, in turn, increases the levels of cAMP in the
cytosol. The activation of other molecules by cAMP leads to the cellular effect of the
hormone. In a similar manner, guanylate cyclase participates in the process of visual
excitation and phototransduction in the eye. (Stryer, L. (1988) Biochemistry. W.H. Freeman
and Co., New York, pp. 975-980, 1029-1035.)

Cytokines are produced in response to cell perturbation. Some cytokines are produced as
precursor forms, and some form multimers
in order to become active. They are produced in groups and in patterns characteristic of the 
particular stimulus or disease, and the members of the group interact with one another and 
other molecules to produce an overall biological response. Interleukins, neurotrophins, 
growth factors, interferons, and chemokines are all families of cytokines which work in 
conjunction with cellular receptors to regulate cell proliferation and differentiation and to 
affect such activities as leukocyte migration and function, hematopoietic cell proliferation,
temperature regulation, acute response to infections, tissue remodeling, and cell survival. 
Studies using antibodies or other drugs that modify the activity of a particular cytokine are 
used to elucidate the roles of individual cytokines in pathology and physiology. 

Chemokines are small chemoattractant cytokines which are active in leukocyte
trafficking. Initially, chemokines were isolated and purified from inflamed tissues, but 
recently several chemokines have been discovered through molecular cloning techniques. 
Chemokines have been shown to be active in cell activation and migration, angiogenic and 
angiostatic activities, suppression of hematopoiesis, HIV infectivity, and promotion of
Th-1 (IL-2-, interferon γ-stimulated) cytokine release.

Chemokines generally contain 70-100 amino acids and are subdivided into four 
subfamilies based on the presence and arrangement of conserved CXC, CC, CX3C and C 
motifs. The CXC (alpha), CC (beta), and CX3C chemokines contain four conserved 
cysteines. The CC subfamily is active on monocytes, lymphocytes, eosinophils, and mast 
cells; the CXC subfamily, on neutrophils; CX3C and C subfamilies, on T-cells. Many of the 
CC chemokines have been characterized functionally as well as structurally. (Callard, R. and 
Gearing, A. (1994) The Cytokine Facts Book. Academic Press, New York, NY, pp. 181-190, 
210-213, 223-227.)
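
For illustration, the subfamily assignment described above can be approximated by inspecting the spacing between the first cysteine of a mature chemokine sequence and the cysteine that follows it. The Python sketch below is a simplified heuristic under that assumption; the function name classify_chemokine is hypothetical, and a real classification would also consider the remaining conserved cysteines and overall sequence similarity.

```python
def classify_chemokine(mature_seq: str) -> str:
    """Assign a chemokine subfamily from the spacing of the first two cysteines.

    Heuristic assumption: the first cysteine in the mature sequence is the
    first conserved cysteine of the CC, CXC, CX3C, or C motif.
    """
    seq = mature_seq.upper()
    first = seq.find("C")
    if first == -1:
        return "unclassified"  # no cysteine at all
    # Offset from the first cysteine to the second: 1 (CC), 2 (CXC), 4 (CX3C).
    for offset, subfamily in ((1, "CC"), (2, "CXC"), (4, "CX3C")):
        second = first + offset
        if second < len(seq) and seq[second] == "C":
            return subfamily
    # A lone leading cysteine is taken to indicate the C subfamily.
    return "C"
```

Because the check proceeds from the shortest spacing to the longest, a sequence containing adjacent cysteines is reported as CC even if another cysteine happens to lie further downstream.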

Growth and differentiation factors function in intercellular communication. Once 
secreted from the cell, some factors require oligomerization or association with ECM in order 
to function. Complex interactions among these factors and their receptors result in the
stimulation or inhibition of cell division, cell differentiation, cell signaling, and cell motility.
Some factors act on their cell of origin (autocrine signaling), some on neighboring cells
(paracrine signaling), and some on distant cells (endocrine signaling).

There are three broad classes of growth and differentiation factors. The first class 
includes the large polypeptide growth factors, e.g., epidermal growth factor, fibroblast growth
factor, transforming growth factor, insulin-like growth factor, and platelet-derived growth
factor. Each of these defines a family of related molecules which stimulate cell proliferation
for wound healing, bone synthesis and remodeling, and regeneration of epithelial, epidermal,
and connective tissues. Nerve growth factor functions specifically as a neurotrophic factor,
and all of these factors induce differentiation of embryonic tissues. The second class
includes the hematopoietic growth factors, which stimulate the
proliferation and differentiation of blood cells such as B-lymphocytes, T-lymphocytes,
erythrocytes, platelets, eosinophils, basophils, neutrophils, macrophages, and their stem cell
precursors. These factors include colony-stimulating factors, erythropoietin, and cytokines,
e.g., interleukins, interferons (IFNs), and tumor necrosis factor (TNF). Cytokines are
secreted by cells of the immune system and function in immunomodulation. The third class
includes small peptide factors, e.g., bombesin, vasopressin, oxytocin, endothelin, transferrin,
angiotensin II, vasoactive intestinal peptide, and bradykinin, which function as hormones to
regulate cellular functions other than proliferation.

Growth and differentiation factors have been shown to play critical roles in neoplastic
transformation of cells in vitro and in tumor progression in vivo. Inappropriate expression of
growth factors by tumor cells may contribute to vascularization and metastasis of melanotic
tumors. In hematopoiesis, growth factor misregulation can result in anemias, leukemias, and
lymphomas. Certain growth factors, e.g., IFN, are cytotoxic to tumor cells both in vivo and
in vitro. Moreover, growth factors and/or their receptors are related both structurally and
functionally to oncoproteins. In addition, growth factors affect transcriptional
regulation of both proto-oncogenes and oncosuppressor genes. (Pimentel, E. (1994)
Handbook of Growth Factors. CRC Press, Ann Arbor, MI, pp. 6-25.)

Proteolytic enzymes or proteases degrade proteins by reducing the activation energy
needed for the hydrolysis of peptide bonds. The major families are the zinc, serine, cysteine,
thiol, and carboxyl proteases.

Zinc proteases, e.g., carboxypeptidase A, have a zinc ion bound to the active site, 
recognize C-terminal residues that contain an aromatic or bulky aliphatic side chain, and 
hydrolyze the peptide bond adjacent to the C-terminal residues. Serine proteases have an 
active site serine residue and include digestive enzymes, e.g., trypsin and chymotrypsin, 
components of the complement and blood-clotting cascades, and enzymes that control the 
degradation and turnover of extracellular matrix (ECM) molecules. Subfamilies of serine 
proteases include tryptases (cleavage after arginine or lysine), aspases (cleavage after 
aspartate), chymases (cleavage after phenylalanine or leucine), metases (cleavage after 
methionine), and serases (cleavage after serine). Cysteine proteases (e.g., cathepsin) are
produced by monocytes, macrophages and other immune cells and are involved in diverse 
cellular processes ranging from the processing of precursor proteins to intracellular 
degradation. Overproduction of these enzymes can cause the tissue destruction associated 
with rheumatoid arthritis and asthma. Thiol proteases, e.g., papain, contain an active site 
cysteine and are widely distributed within tissues. Thiol proteases effect catalysis through a 
thiol ester intermediate facilitated by a proximal histidine side chain. Carboxyl proteases, 
e.g., pepsin, are active only under acidic conditions (pH 2 to 3). The active site of pepsin 
contains two aspartate residues; when one aspartate is ionized and the other is not, the 
enzyme is active. A common feature of the carboxyl proteases is that they are inhibited by 
very low concentrations (10^-10 M) of the inhibitor pepstatin. A substrate analog which
induces structural changes at the active site of a protease functions as an antagonist or 
inhibitor. 
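
The cleavage preferences listed above for the serine protease subfamilies can be sketched as a simple lookup. The Python example below is an idealized illustration that considers only the residue immediately preceding the scissile bond; the names CLEAVAGE_RESIDUES, cleavage_sites, and digest are hypothetical, and real enzymes have additional sequence and structural requirements that are not modeled here.

```python
# Residues after which each serine protease subfamily cleaves, per the text.
CLEAVAGE_RESIDUES = {
    "tryptase": "RK",  # arginine or lysine
    "aspase": "D",     # aspartate
    "chymase": "FL",   # phenylalanine or leucine
    "metase": "M",     # methionine
    "serase": "S",     # serine
}


def cleavage_sites(peptide: str, subfamily: str) -> list:
    """Return 0-based cut positions; a cut at i separates residues i-1 and i."""
    residues = CLEAVAGE_RESIDUES[subfamily]
    # The final residue is skipped because nothing follows it to be cut off.
    return [i + 1 for i, aa in enumerate(peptide.upper()[:-1]) if aa in residues]


def digest(peptide: str, subfamily: str) -> list:
    """Split a peptide into the fragments expected from a complete digest."""
    cuts = [0] + cleavage_sites(peptide, subfamily) + [len(peptide)]
    return [peptide[a:b] for a, b in zip(cuts, cuts[1:])]
```

For a toy peptide such as AKDFMSRG, an idealized tryptase digest would yield the fragments AK, DFMSR, and G.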

Guanosine triphosphate-binding proteins (G proteins) participate in intracellular 
signal transduction and control regulatory pathways through cell surface receptors. These 
receptors respond to hormones, growth factors, neuromodulators, or other signaling 
molecules, by binding GTP. Binding of GTP leads to the production of cAMP which 
controls phosphorylation and activation of other proteins. During this process, the hydrolysis 
of GTP acts as an energy source as well as an on-off switch for the GTPase activity. 

The G proteins are small proteins which consist of single 21-30 kDa polypeptides. 
They can be classified into five subfamilies: Ras, Rho, Ran, Rab, and ADP-ribosylation
factor. These proteins regulate cell growth, cell cycle control, protein secretion, and
intracellular vesicle interaction. In particular, the Ras proteins are essential in transducing
signals from receptor tyrosine kinases to serine/threonine kinases which control cell growth
and differentiation. Mutant Ras proteins, which bind but cannot hydrolyze GTP, are
permanently activated and cause continuous cell proliferation or cancer.

All five subfamilies share common structural features and four conserved motifs, I to
IV. Motif I is the most variable and has the signature GXXXXGK, in which lysine
interacts with the β- and γ-phosphate groups of GTP. Motifs II, III, and IV have DTAGQE,
NKXD, and EXSAX as their respective signatures and regulate the binding of γ-phosphate,
GTP, and the guanine base of GTP, respectively. Most of the membrane-bound G proteins
require a carboxy terminal isoprenyl group (CAAX), added posttranslationally, for membrane
association and biological activity. The G proteins also have a variable effector region,
located between motifs I and II, which is characterized as the interaction site for guanine
nucleotide exchange factors or GTPase-activating proteins.
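
As an illustrative sketch, the four motif signatures recited above (GXXXXGK, DTAGQE, NKXD, and EXSAX, where X is any residue) can be located in an amino acid sequence with regular expressions. The Python example below uses hypothetical names (G_PROTEIN_MOTIFS, find_g_protein_motifs) and a synthetic toy sequence; it requires exact matches to the signatures, whereas naturally occurring proteins may carry conservative substitutions.

```python
import re

# Motif signatures from the text, with X rendered as "." (any residue).
G_PROTEIN_MOTIFS = {
    "I": r"G.{4}GK",   # GXXXXGK
    "II": r"DTAGQE",   # DTAGQE
    "III": r"NK.D",    # NKXD
    "IV": r"E.SA.",    # EXSAX
}


def find_g_protein_motifs(sequence: str) -> dict:
    """Map each motif name to the 0-based start positions of its matches."""
    seq = sequence.upper()
    return {
        name: [m.start() for m in re.finditer(regex, seq)]
        for name, regex in G_PROTEIN_MOTIFS.items()
    }


def has_all_motifs_in_order(sequence: str) -> bool:
    """True if motifs I to IV each occur and first appear in the order I, II, III, IV."""
    hits = find_g_protein_motifs(sequence)
    if not all(hits[name] for name in ("I", "II", "III", "IV")):
        return False
    firsts = [hits[name][0] for name in ("I", "II", "III", "IV")]
    return firsts == sorted(firsts)
```

The ordering check reflects only the numbering of the motifs in the text and is an assumption about their arrangement along the polypeptide.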

Eukaryotic cells are bounded by a membrane and subdivided into membrane-bound
compartments. As membranes are impermeable to many ions and polar molecules, transport
of these molecules is mediated by ion channels, ion pumps, or transport proteins.
Symporters and antiporters regulate cytosolic pH by transporting ions and small molecules,
e.g., amino acids, glucose, and drugs, across membranes; symporters transport small
molecules and ions in the same direction, and antiporters, in the opposite direction.

Transporter superfamilies include facilitative transporters and active ATP binding cassette 
transporters involved in multiple-drug resistance and the targeting of antigenic peptides to 
MHC Class I molecules. These transporters bind to a specific ion or other molecule and 
undergo conformational changes in order to transfer the ion or molecule across a membrane. 
Transport can occur by a passive, concentration-dependent mechanism or can be linked to an
energy source such as ATP hydrolysis or an ion gradient. 

Ion channels are formed by transmembrane proteins which form a lined passageway 
across the membrane through which water and ions, e.g., Na+, K+, Ca2+, and Cl-, enter and exit
the cell. For example, chloride channels are involved in the regulation of the membrane
electric potential as well as absorption and secretion of ions across the membrane. In
intracellular membranes of the Golgi apparatus and endocytic vesicles, chloride channels also
regulate organelle pH. Electrophysiological and pharmacological studies suggest that a 
variety of chloride channels exist in different cell types and that many of these channels have 
one or more protein kinase phosphorylation sites. 

Ion pumps are ATPases which actively maintain membrane gradients. Ion pumps can 
be grouped into three classes, P, V, and F, according to their structure and function. All
have one or more binding sites for ATP on the cytosolic face of the membrane. The P-class
ion pumps consist of two α and two β transmembrane subunits, include Ca2+ ATPase and
Na+/K+ ATPase, and function in transporting H+, Na+, K+, and Ca2+ ions. The V- and F-class
ion pumps have similar structures, a cytosolic domain formed by at least five extrinsic
polypeptides and at least two transmembrane proteins, and only transport H+. F-class H+ pumps
have been identified from the membranes of mitochondria and chloroplasts, and V-class H+
pumps regulate acidity inside lysosomes, endosomes, and plant vacuoles. 

A family of structurally related intrinsic membrane proteins known as facilitative 
glucose transporters catalyze the movement of glucose and other selected sugars across the 
plasma membrane. The proteins in this family contain a highly conserved, large 
transmembrane domain made of 12 transmembrane α-helices, and several less conserved,
asymmetric, cytoplasmic and exoplasmic domains. (Pessin, J.E. and Bell, G.I. (1992) Annu.
Rev. Physiol. 54:911-930.) 

Amino acid transport is mediated by Na+-dependent amino acid transporters. These
transporters are involved in gastrointestinal and renal uptake of dietary and cellular amino 
acids and the re-uptake of neurotransmitters. Transport of cationic amino acids is mediated 
by the system y+ family members and the cationic amino acid transporter (CAT) family. 
Members of the CAT family share a high degree of sequence homology, and each contains 
12-14 putative transmembrane domains. (Ito, K. and Groudine, M. (1997) J. Biol. Chem. 
272:26780-26786.) 

Proton-coupled, 12 membrane-spanning domain transporters such as PEPT 1 and 
PEPT 2 are responsible for gastrointestinal absorption and for renal reabsorption of peptides
using an electrochemical H+ gradient as the driving force. A heterodimeric peptide
transporter, consisting of TAP 1 and TAP 2, is associated with antigen processing. Peptide
antigens are transported across the membrane of the endoplasmic reticulum so they can be
presented to the major histocompatibility complex class I molecules. Each TAP protein 
consists of multiple hydrophobic membrane spanning segments and a highly conserved 
ATP-binding cassette. (Boll, M. et al. (1996) Proc. Natl. Acad. Sci. 93:284-289.) 

Hormones are secreted molecules that circulate in the body fluids and bind to specific
receptors on the surface of, or within, target tissue cells. Although they have diverse
biochemical compositions and mechanisms of action, hormones can be grouped into two
categories. One category consists of small lipophilic molecules that diffuse through the
plasma membrane of target cells, bind to cytosolic or nuclear receptors, and form a complex
that alters gene expression. Examples of this category include retinoic acid, thyroxine, and the
cholesterol-derived steroid hormones progesterone, estrogen, testosterone, cortisol, and
aldosterone. These hormones have a long half-life, e.g., several hours to days, and long-term
effects on their target cells. Their solubility in the blood may be increased by their association
with carrier molecules. Within the target cell nucleus, hormone/receptor complexes bind to
specific response elements in target gene regulatory regions.

A second category consists of hydrophilic hormones that function by binding to cell
surface receptors and transducing the signal across the plasma membrane. Examples of this
category include amino acid derivatives, e.g., the catecholamines epinephrine and
norepinephrine, and histamine; and peptide hormones, e.g., glucagon, insulin, gastrin, secretin,
cholecystokinin, adrenocorticotropic hormone, follicle stimulating hormone, luteinizing
hormone, thyroid stimulating hormone, parathormone, and vasopressin. Peptide hormones
are synthesized as inactive forms and stored in secretory vesicles. These hormones are
activated by protease cleavage before being released from the cell. Many hydrophilic
hormones have a very short half-life and effect, e.g., seconds to hours, and are inactivated by
proteases in the blood. (Lodish et al. (1995) Molecular Cell Biology. Scientific American
Books Inc., New York, NY, pp. 856-864.) 

Neuropeptides and vasomediators (NP/VM) comprise a large family of endogenous 
signaling molecules. Included in the family are neurotransmitters such as bombesin, 
neuropeptide Y, neurotensin, neuromedin N, melanocortins, opioids, e.g., enkephalins, 
30 endorphins and dynorphins, galanin, somatostatin, tachykinins, vasopressin, and vasoactive 
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intestinal peptide, and circulatory system-borne signaling molecules, e.g., angiotensin, 
complement, calcitonin, endothelins, formyl-methionyl peptides, glucagon, cholecystokinin 
and gastrin. These proteins are synthesized as "pre-pro" molecules, and are activated and 
inactivated by proteolytic cleavage. NP/VMs can transduce signals directly, modulate the 
activity or release of other neurotransmitters and hormones, and act as catalytic enzymes in 
cascades. The effects of NP/VMs range from extremely brief to long-lasting 
(melanocortin-mediated changes in skin melanin). 

Regulatory molecules turn individual genes or 
groups of genes on and off in response to various inductive mechanisms of the cell or 
organism; act as transcription factors by determining whether or not transcription is initiated, 
enhanced, or repressed; and splice transcripts as dictated in a particular cell or tissue. 
Although they interact with short stretches of DNA scattered throughout the entire genome, 
most gene expression is regulated near the site at which transcription starts or within the open 
reading frame of the gene being expressed. The regulated stretches of the DNA can be simple 
and interact with only a single protein, or they can require several proteins acting as part of a 
complex to regulate gene expression. The external features of the double helix which provide 
recognition sites are hydrogen bond donor and acceptor groups, hydrophobic patches, major 
and minor grooves, and regular, repeated stretches of sequences which cause distinct bends in 
the helix. The surface features of the regulatory molecule are complementary to those of the 
DNA. 

Many of the transcription factors incorporate one of a set of DNA-binding structural 
motifs, each of which contains either α helices or β sheets and binds to the major groove of 
DNA. Seven of the structural motifs common to transcription factors are helix-turn-helix, 
homeodomains, zinc finger, steroid receptor, β sheets, leucine zipper, and helix-loop-helix. 
(Pabo, C.O. and R.T. Sauer (1992) Ann. Rev. Biochem. 61:1053-95.) Other domains of 
transcription factors may form crucial contacts with the DNA. In addition, accessory proteins 
provide important interactions which may convert a particular protein complex to an activator 
or a repressor or may prevent binding. (Alberts, B. et al. (1994) Molecular Biology of the 
Cell. Garland Publishing Co, New York, NY pp. 401-474.) 

The discovery of new human signal peptide-containing proteins and the 
polynucleotides encoding these molecules satisfies a need in the art by providing new 
compositions which are useful in the diagnosis, treatment, and prevention of cancer and 
immunological disorders. 

SUMMARY OF THE INVENTION 

The invention features a substantially purified human signal peptide-containing 
protein (SIGP), having an amino acid sequence selected from the group consisting of SEQ ID 
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID 
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ 
ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, 
SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID 
NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, 
SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID 
NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, 
SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID 
NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, 
SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID 
NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, 
SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID 
NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, 
SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, and SEQ ID NO:77. 

The invention further provides isolated and substantially purified polynucleotides 
encoding SIGP. In a particular aspect, the polynucleotide has a nucleic acid sequence 
selected from the group consisting of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ 
ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, 
SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID 
NO:92, SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, 
SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ 
ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID 
NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112, SEQ ID 
NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID NO:117, SEQ ID 
NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID NO:122, SEQ ID 
NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID NO:127, SEQ ID 
NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID NO:132, SEQ ID 
NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID NO:137, SEQ ID 
NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID NO:142, SEQ ID 
NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147, SEQ ID 
NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID NO:152, SEQ ID 
NO:153, and SEQ ID NO:154. 

In addition, the invention provides a polynucleotide, or fragment thereof, which 
hybridizes to any of the polynucleotides encoding an SIGP selected from the group consisting 
of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID 
NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ 
ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, 
SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID 
NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, 
SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID 
NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, 
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID 
NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, 
SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID 
NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, 
SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID 
NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, 
SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, and SEQ ID NO:77. In 
another aspect, the invention provides a composition comprising isolated and purified 
polynucleotides selected from the group consisting of SEQ ID NO:78, SEQ ID NO:79, SEQ 
ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, 
SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID 
NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, 
SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID 
NO:102, SEQ ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID 
NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID 
NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID 
NO:117, SEQ ID NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID 
NO:122, SEQ ID NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID 
NO:127, SEQ ID NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID 
NO:132, SEQ ID NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID 
NO:137, SEQ ID NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID 
NO:142, SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID 
NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID 
NO:152, SEQ ID NO:153, and SEQ ID NO:154, or a fragment thereof. 

The invention further provides a polynucleotide comprising the complement, or 
fragments thereof, of any one of the polynucleotides encoding SIGP. In another aspect, the 
invention provides compositions comprising isolated and purified polynucleotides comprising 
the complement of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID 
NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, 
SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID 
NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, 
SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ 
ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID 
NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID 
NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID NO:117, SEQ ID NO:118, SEQ ID 
NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID NO:122, SEQ ID NO:123, SEQ ID 
NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID NO:127, SEQ ID NO:128, SEQ ID 
NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID NO:132, SEQ ID NO:133, SEQ ID 
NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID NO:137, SEQ ID NO:138, SEQ ID 
NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID NO:142, SEQ ID NO:143, SEQ ID 
NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147, SEQ ID NO:148, SEQ ID 
NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID NO:152, SEQ ID NO:153, and SEQ ID 
NO:154, or fragments thereof. 

The present invention further provides an expression vector containing at least a 
fragment of any one of the polynucleotides selected from the group consisting of SEQ ID 
NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, 
SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID 
NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94, 
SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID 
NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ ID NO:104, SEQ ID 
NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID 
NO:110, SEQ ID NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID 
NO:115, SEQ ID NO:116, SEQ ID NO:117, SEQ ID NO:118, SEQ ID NO:119, SEQ ID 
NO:120, SEQ ID NO:121, SEQ ID NO:122, SEQ ID NO:123, SEQ ID NO:124, SEQ ID 
NO:125, SEQ ID NO:126, SEQ ID NO:127, SEQ ID NO:128, SEQ ID NO:129, SEQ ID 
NO:130, SEQ ID NO:131, SEQ ID NO:132, SEQ ID NO:133, SEQ ID NO:134, SEQ ID 
NO:135, SEQ ID NO:136, SEQ ID NO:137, SEQ ID NO:138, SEQ ID NO:139, SEQ ID 
NO:140, SEQ ID NO:141, SEQ ID NO:142, SEQ ID NO:143, SEQ ID NO:144, SEQ ID 
NO:145, SEQ ID NO:146, SEQ ID NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID 
NO:150, SEQ ID NO:151, SEQ ID NO:152, SEQ ID NO:153, and SEQ ID NO:154. In yet 
another aspect, the expression vector containing the polynucleotide is contained within a host 
cell. 

The invention also provides a method for producing a polypeptide or a fragment 
thereof, the method comprising the steps of: (a) culturing the host cell containing an 
expression vector containing at least a fragment of a polynucleotide encoding SIGP under 
conditions suitable for the expression of the polypeptide; and (b) recovering the polypeptide 
from the host cell culture. 

The invention also provides a pharmaceutical composition comprising a substantially 
purified SIGP in conjunction with a suitable pharmaceutical carrier. 

The invention further includes a purified antibody which binds to SIGP, as well as a 
purified agonist and a purified antagonist of SIGP. 

The invention also provides a method for treating or preventing a cancer associated 
with the decreased expression or activity of SIGP, the method comprising the step of 
administering to a subject in need of such treatment an effective amount of a pharmaceutical 
composition containing SIGP. 

The invention also provides a method for treating or preventing a cancer associated 
with the increased expression or activity of SIGP, the method comprising the step of 
administering to a subject in need of such treatment an effective amount of an antagonist of 
SIGP. 

The invention also provides a method for treating or preventing an immune response 
associated with the increased expression or activity of SIGP, the method comprising the step 
of administering to a subject in need of such treatment an effective amount of an antagonist of 
SIGP. 

The invention also provides a method for detecting a nucleic acid sequence which 
encodes a human regulatory protein in a biological sample, the method comprising the steps 
of: a) hybridizing a nucleic acid sequence of the biological sample to a polynucleotide 
sequence complementary to the polynucleotide encoding SIGP, thereby forming a 
hybridization complex; and b) detecting the hybridization complex, wherein the presence of 
the hybridization complex correlates with the presence of the nucleic acid sequence encoding 
the human regulatory protein in the biological sample. 

The invention also provides a microarray containing at least a fragment of at least one 
of the polynucleotides encoding a polypeptide having an amino acid sequence selected from 
the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID 
NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID 
NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, 
SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID 
NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, 
SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID 
NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, 
SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID 
NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, 
SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID 
NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, 
SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID 
NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, 
SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, and SEQ 
ID NO:77. 

The invention also provides a method for detecting the expression level of a nucleic 
acid encoding a human regulatory protein in a biological sample, the method comprising the 
steps of hybridizing the nucleic acid sequence of the biological sample to a complementary 
polynucleotide, thereby forming a hybridization complex; and determining expression of the 
nucleic acid sequence encoding a human regulatory protein in the biological sample by 
identifying the presence of the hybridization complex. In a preferred embodiment, prior to 
the hybridizing step, the nucleic acid sequences of the biological sample are amplified and 
labeled by the polymerase chain reaction. 

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleotide sequences, and methods are described, it is 
understood that this invention is not limited to the particular methodology, protocols, cell 
lines, vectors, and reagents described, as these may vary. It is also to be understood that the 
terminology used herein is for the purpose of describing particular embodiments only, and is 
not intended to limit the scope of the present invention which will be limited only by the 
appended claims. 

It must be noted that as used herein and in the appended claims, the singular forms 
"a," "an," and "the" include plural reference unless the context clearly dictates otherwise. 
Thus, for example, a reference to "a host cell" includes a plurality of such host cells, and a 
reference to "an antibody" is a reference to one or more antibodies and equivalents thereof 
known to those skilled in the art, and so forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meanings as commonly understood by one of ordinary skill in the art to which this invention 
belongs. Although any methods and materials similar or equivalent to those described herein 
can be used in the practice or testing of the present invention, the preferred methods, devices, 
and materials are now described. All publications mentioned herein are cited for the purpose 
of describing and disclosing the cell lines, vectors, and methodologies which are reported in 
the publications and which might be used in connection with the invention. Nothing herein is 
to be construed as an admission that the invention is not entitled to antedate such disclosure 
by virtue of prior invention. 

DEFINITIONS 

"SIGP," as used herein, refers to the amino acid sequences of substantially purified 
SIGP obtained from any species, particularly a mammalian species, including bovine, ovine, 
porcine, murine, equine, and preferably the human species, from any source, whether natural, 
synthetic, semi-synthetic, or recombinant. 

The term "agonist," as used herein, refers to a molecule which, when bound to SIGP, 
increases or prolongs the duration of the effect of SIGP. Agonists may include proteins, 
nucleic acids, carbohydrates, or any other molecules which bind to and modulate the effect of 
SIGP. 

An "allele" or an "allelic sequence," as these terms are used herein, is an alternative 
form of the gene encoding SIGP. Alleles may result from at least one mutation in the nucleic 
acid sequence and may result in altered mRNAs or in polypeptides whose structure or 
function may or may not be altered. Any given natural or recombinant gene may have none, 
one, or many allelic forms. Common mutational changes which give rise to alleles are 
generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of 
these types of changes may occur alone, or in combination with the others, one or more times 
in a given sequence. 

"Altered" nucleic acid sequences encoding SIGP, as described herein, include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a 
polynucleotide encoding the same SIGP or a polypeptide with at least one functional characteristic of 
SIGP. Included within this definition are polymorphisms which may or may not be readily 
detectable using a particular oligonucleotide probe of the polynucleotide encoding SIGP, and 
improper or unexpected hybridization to alleles, with a locus other than the normal 
chromosomal locus for the polynucleotide sequence encoding SIGP. The encoded protein 
may also be "altered," and may contain deletions, insertions, or substitutions of amino acid 
residues which produce a silent change and result in a functionally equivalent SIGP. 
Deliberate amino acid substitutions may be made on the basis of similarity in polarity, 
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the 
residues, as long as the biological or immunological activity of SIGP is retained. For 
example, negatively charged amino acids may include aspartic acid and glutamic acid, 
positively charged amino acids may include lysine and arginine, and amino acids with 
uncharged polar head groups having similar hydrophilicity values may include leucine, 
isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine and threonine; 
and phenylalanine and tyrosine. 
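
The groupings recited above can be expressed as a simple lookup. The following is a minimal sketch only; the function names `is_conservative` and `classify_substitutions` are hypothetical and are not part of the disclosure, and the groups are limited to the examples given in the preceding paragraph (one-letter amino acid codes are assumed). Whether a substitution actually retains the biological or immunological activity of SIGP must still be determined experimentally.

```python
# Substitution groups taken only from the examples in the paragraph above.
# One-letter amino acid codes are an assumption of this sketch.
SIMILAR_GROUPS = [
    {"D", "E"},       # aspartic acid, glutamic acid (negatively charged)
    {"K", "R"},       # lysine, arginine (positively charged)
    {"L", "I", "V"},  # leucine, isoleucine, valine
    {"G", "A"},       # glycine, alanine
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original, replacement):
    """Return True if both residues are identical or share a listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, reference residue, variant residue, conservative?)
    for every position at which two equal-length sequences differ.
    Positions are 1-based."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    changes = []
    for pos, (r, v) in enumerate(zip(reference.upper(), variant.upper()), start=1):
        if r != v:
            changes.append((pos, r, v, is_conservative(r, v)))
    return changes
```

For instance, `classify_substitutions("MKDLS", "MRDLT")` reports the K-to-R change at position 2 and the S-to-T change at position 5, both flagged as falling within a listed group.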

The terms "amino acid" or "amino acid sequence," as used herein, refer to an 
oligopeptide, peptide, polypeptide, or protein sequence, or a fragment of any of these, and to 
naturally occurring or synthetic molecules. In this context, "fragments", "immunogenic 
fragments", or "antigenic fragments" refer to fragments of SIGP which are preferably about 5 
to about 15 amino acids in length and which retain some biological activity or immunological 
activity of SIGP. Where "amino acid sequence" is recited herein to refer to an amino acid 
sequence of a naturally occurring protein molecule, "amino acid sequence" and like terms are 
not meant to limit the amino acid sequence to the complete native amino acid sequence 
associated with the recited protein molecule. 

"Amplification," as used herein, relates to the production of additional copies of a 
nucleic acid sequence. Amplification is generally carried out using polymerase chain reaction 
(PCR) technologies well known in the art. (See, e.g., Dieffenbach, C.W. and G.S. Dveksler 
(1995) PCR Primer, a Laboratory Manual Cold Spring Harbor Press, Plainview, NY, pp. 1-5.) 

The term "antagonist," as it is used herein, refers to a molecule which, when bound to 
SIGP, decreases the amount or the duration of the effect of the biological or immunological 
activity of SIGP. Antagonists may include proteins, nucleic acids, carbohydrates, antibodies, 
or any other molecules which decrease the effect of SIGP. 

As used herein, the term "antibody" refers to intact molecules as well as to fragments 
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding the epitopic 
determinant. Antibodies that bind SIGP polypeptides can be prepared using intact 
polypeptides or using fragments containing small peptides of interest as the immunizing 
antigen. The polypeptide or oligopeptide used to immunize an animal (e.g., a mouse, a rat, or 
a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be 
conjugated to a carrier protein if desired. Commonly used carriers that are chemically 
coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet 
hemocyanin (KLH). The coupled peptide is then used to immunize the animal. 

The term "antigenic determinant," as used herein, refers to that fragment of a 
molecule (i.e., an epitope) that makes contact with a particular antibody. When a protein or a 
fragment of a protein is used to immunize a host animal, numerous regions of the protein may 
induce the production of antibodies which bind specifically to antigenic determinants (given 
regions or three-dimensional structures on the protein). An antigenic determinant may 
compete with the intact antigen (i.e., the immunogen used to elicit the immune response) for 
binding to an antibody. 

The term "antisense," as used herein, refers to any composition containing a nucleic 
acid sequence which is complementary to a specific nucleic acid sequence. The term 
"antisense strand" is used in reference to a nucleic acid strand that is complementary to the 
"sense" strand. Antisense molecules may be produced by any method including synthesis or 
transcription. Once introduced into a cell, the complementary nucleotides combine with 
natural sequences produced by the cell to form duplexes and to block either transcription or 
translation. The designation "negative" can refer to the antisense strand, and the designation 
"positive" can refer to the sense strand. 

As used herein, the term "biologically active" refers to a protein having structural, 
regulatory, or biochemical functions of a naturally occurring molecule. Likewise, 
"immunologically active" refers to the capability of the natural, recombinant, or synthetic 
SIGP, or of any oligopeptide thereof, to induce a specific immune response in appropriate 
animals or cells and to bind with specific antibodies. 

The terms "complementary" or "complementarity," as used herein, refer to the natural 
binding of polynucleotides under permissive salt and temperature conditions by base pairing. 
For example, the sequence "A-G-T" binds to the complementary sequence "T-C-A." 
Complementarity between two single-stranded molecules may be "partial," such that only 
some of the nucleic acids bind, or it may be "complete," such that total complementarity 
exists between the single stranded molecules. The degree of complementarity between 
nucleic acid strands has significant effects on the efficiency and strength of the hybridization 
between the nucleic acid strands. This is of particular importance in amplification reactions, 
which depend upon binding between nucleic acid strands, and in the design and use of 
peptide nucleic acid (PNA) molecules. 
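
The distinction between "partial" and "complete" complementarity can be illustrated computationally. The sketch below is an assumption-laden illustration, not a method of the disclosure: the names `complement` and `complementarity` are hypothetical, only the four DNA bases A, C, G, and T are handled, and the two strands are compared position by position as written in the "A-G-T" / "T-C-A" example above.

```python
# Watson-Crick pairing for DNA bases only (an assumption of this sketch).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def _clean(seq):
    """Upper-case a sequence and drop the hyphens used in the text's notation."""
    return seq.upper().replace("-", "")


def complement(seq):
    """Return the base-by-base complement, e.g. A-G-T gives TCA."""
    return "".join(PAIRS[base] for base in _clean(seq))


def complementarity(strand_a, strand_b):
    """Fraction of aligned positions whose bases pair.

    1.0 corresponds to "complete" complementarity; any value between
    0 and 1 corresponds to "partial" complementarity.
    """
    a, b = _clean(strand_a), _clean(strand_b)
    if len(a) != len(b):
        raise ValueError("strands must be the same length")
    if not a:
        raise ValueError("strands must not be empty")
    paired = sum(1 for x, y in zip(a, b) if PAIRS.get(x) == y)
    return paired / len(a)
```

Here `complementarity("A-G-T", "T-C-A")` evaluates to 1.0 (complete), while `complementarity("AGT", "TCC")` pairs at two of three positions (partial).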

A "composition comprising a given polynucleotide sequence" or a "composition 
comprising a given amino acid sequence," as these terms are used herein, refer broadly to any 
composition containing the given polynucleotide or amino acid sequence. The composition 
may comprise a dry formulation, an aqueous solution, or a sterile composition. Compositions 
comprising polynucleotides encoding SIGP, e.g., SEQ ID NO:78, SEQ ID NO:79, SEQ ID 
NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, 
SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID 
NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, 
SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID 
NO:102, SEQ ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID 
NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID 
NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID 
NO:117, SEQ ID NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID 
NO:122, SEQ ID NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID 
NO:127, SEQ ID NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID 
NO:132, SEQ ID NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID 
NO:137, SEQ ID NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID 
NO:142, SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID 
NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID 
NO:152, SEQ ID NO:153, and SEQ ID NO:154, or fragments thereof, may be employed as 
hybridization probes. The probes may be stored in freeze-dried form and may be associated 
with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be deployed 
in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., SDS) and other 
components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.). 

The phrase "consensus sequence," as used herein, refers to a nucleic acid sequence 
which has been resequenced to resolve uncalled bases, extended using XL-PCR™ (Perkin 
Elmer, Norwalk, CT) in the 5' and/or the 3' direction, and resequenced, or which has been 
assembled from the overlapping sequences of more than one Incyte Clone using a computer 
program for fragment assembly, such as the GELVIEW™ Fragment Assembly system (GCG, 
Madison, WI). Some sequences have been both extended and assembled to produce the 
consensus sequence. 

As used herein, the term "correlates with expression of a polynucleotide" indicates 
that the detection of the presence of nucleic acids, the same or related to a nucleic acid 
sequence encoding SIGP, by northern analysis is indicative of the presence of nucleic acids 
encoding SIGP in a sample, and thereby correlates with expression of the transcript from the 
polynucleotide encoding SIGP. 

The term "SIGP" refers to any or all of the human polypeptides, SIGP-1, SIGP-2, 
SIGP-3, SIGP-4, SIGP-5, SIGP-6, SIGP-7, SIGP-8, SIGP-9, SIGP-10, SIGP-11, SIGP-12, 
SIGP-13, SIGP-14, SIGP-15, SIGP-16, SIGP-17, SIGP-18, SIGP-19, SIGP-20, SIGP-21, 
SIGP-22, SIGP-23, SIGP-24, SIGP-25, SIGP-26, SIGP-27, SIGP-28, SIGP-29, SIGP-30, 
SIGP-31, SIGP-32, SIGP-33, SIGP-34, SIGP-35, SIGP-36, SIGP-37, SIGP-38, SIGP-39, 
SIGP-40, SIGP-41, SIGP-42, SIGP-43, SIGP-44, SIGP-45, SIGP-46, SIGP-47, SIGP-48, 
SIGP-49, SIGP-50, SIGP-51, SIGP-52, SIGP-53, SIGP-54, SIGP-55, SIGP-56, SIGP-57, 
SIGP-58, SIGP-59, SIGP-60, SIGP-61, SIGP-62, SIGP-63, SIGP-64, SIGP-65, SIGP-66, 
SIGP-67, SIGP-68, SIGP-69, SIGP-70, SIGP-71, SIGP-72, SIGP-73, SIGP-74, SIGP-75, 
SIGP-76, and SIGP-77. 

A "deletion," as the term is used herein, refers to a change in the amino acid or 
nucleotide sequence that results in the absence of one or more amino acid residues or 
nucleotides. 

The term "derivative," as used herein, refers to the chemical modification of SIGP, of 
a polynucleotide sequence encoding SIGP, or of a polynucleotide sequence complementary to 
a polynucleotide sequence encoding SIGP. Chemical modifications of a polynucleotide 
sequence can include, for example, replacement of hydrogen by an alkyl, acyl, or amino 
group. A derivative polynucleotide encodes a polypeptide which retains at least one 
biological or immunological function of the natural molecule. A derivative polypeptide is 
one modified by glycosylation, pegylation, or any similar process that retains at least one 
biological or immunological function of the polypeptide from which it was derived. 

The term "homology," as used herein, refers to a degree of complementarity. There 
may be partial homology or complete homology. The word "identity" may substitute for the 
word "homology." A partially complementary sequence that at least partially inhibits an 
identical sequence from hybridizing to a target nucleic acid is referred to as "substantially 
homologous." The inhibition of hybridization of the completely complementary sequence to 
the target sequence may be examined using a hybridization assay (Southern or northern blot, 
solution hybridization, and the like) under conditions of reduced stringency. A substantially 
homologous sequence or hybridization probe will compete for and inhibit the binding of a 
completely homologous sequence to the target sequence under conditions of reduced 
stringency. This is not to say that conditions of reduced stringency are such that non-specific 
binding is permitted, as reduced stringency conditions require that the binding of two 
sequences to one another be a specific (i.e., a selective) interaction. The absence of 
non-specific binding may be tested by the use of a second target sequence which lacks even a 
partial degree of complementarity (e.g., less than about 30% homology or identity). In the 
absence of non-specific binding, the substantially homologous sequence or probe will not 
hybridize to the second non-complementary target sequence. 

The phrases "percent identity" or "% identity" refer to the percentage of sequence 
similarity found in a comparison of two or more amino acid or nucleic acid sequences. 
Percent identity can be determined electronically, e.g., by using the MegAlign program 
(Lasergene software package, DNASTAR, Inc., Madison, WI). The MegAlign program can 
create alignments between two or more sequences according to different methods, e.g., the 
Clustal Method. (Higgins, D.G. and Sharp, P.M. (1988) Gene 73:237-244.) The Clustal 
algorithm groups sequences into clusters by examining the distances between all pairs. The 
clusters are aligned pairwise and then in groups. The percentage similarity between two 
amino acid sequences, e.g., sequence A and sequence B, is calculated by dividing the length 
of sequence A, minus the number of gap residues in sequence A, minus the number of gap 
residues in sequence B, into the sum of the residue matches between sequence A and 
sequence B, times one hundred. Gaps of low or of no homology between the two amino acid 
sequences are not included in determining percentage similarity. Percent identity between 
nucleic acid sequences can also be calculated by the Clustal Method, or by other methods 
known in the art, such as the Jotun Hein Method. (See, e.g., Hein, J. (1990) Methods in 
Enzymology 183:626-645.) Identity between sequences can also be determined by other 
methods known in the art, e.g., by varying hybridization conditions. 
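
The percentage similarity calculation recited above can be sketched as follows. This is an illustrative reading only and does not reproduce the MegAlign program or the Clustal alignment step. It assumes the two sequences have already been aligned, that gaps are written as "-", and that "the length of sequence A" means its aligned length including gap characters; the function name `percent_similarity` is hypothetical.

```python
GAP = "-"  # gap character in the pre-aligned input (an assumption of this sketch)


def percent_similarity(aligned_a, aligned_b):
    """Percentage similarity of two pre-aligned sequences.

    Computed as: matches / (length of A - gaps in A - gaps in B) * 100,
    where "length of A" is taken to be the aligned length. That reading
    is an assumption; the source text does not define the term further.
    """
    a, b = aligned_a.upper(), aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must be the same length")

    gaps_a = a.count(GAP)
    gaps_b = b.count(GAP)
    denominator = len(a) - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("no comparable residues after removing gaps")

    # A match is an identical residue at the same aligned position;
    # gap characters are never counted as matches.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != GAP)
    return matches / denominator * 100.0
```

Under these assumptions, two identical sequences score 100, and a single mismatch in an ungapped six-residue alignment scores five matches out of six positions.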

"Human artificial chromosomes" (HACs), as described herein, are linear 
microchromosomes which may contain DNA sequences of about 6 kb to 10 Mb in size, and 
which contain all of the elements required for stable mitotic chromosome segregation and 
maintenance. (See, e.g., Harrington, J.J. et al. (1997) Nat Genet 15:345-355.) 

The term "humanized antibody," as used herein, refers to antibody molecules in which
the amino acid sequence in the non-antigen binding regions has been altered so that the
antibody more closely resembles a human antibody, and still retains its original binding 
ability. 



The term "hybridization complex," as used herein, refers to a complex
formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds
between complementary bases. A hybridization complex may be formed in solution (e.g., C0t
or R0t analysis) or formed between one nucleic acid sequence present in solution and another
nucleic acid sequence immobilized on a solid support (e.g., paper, membranes, filters, chips,
pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids
have been fixed).

The words "insertion" or "addition," as used herein, refer to changes in an amino acid 
or nucleotide sequence resulting in the addition of one or more amino acid residues or 
nucleotides, respectively, to the sequence found in the naturally occurring molecule.

"Immune response" can refer to conditions associated with inflammation, trauma, 
immune disorders, or infectious or genetic disease, etc. These conditions can be 
characterized by expression of various factors, e.g., cytokines, chemokines, and other 
signaling molecules, which may affect cellular and systemic defense systems.

The term "microarray," as used herein, refers to an array of distinct polynucleotides or
oligonucleotides arrayed on a substrate, such as paper, nylon or any other type of membrane,
filter, chip, glass slide, or any other suitable solid support.

"Hybridization," as the term is used herein, refers to any process by which a strand of
nucleic acid binds with a complementary strand through base pairing.

The term "modulate," as it appears herein, refers to a change in the activity of SIGP. 
For example, modulation may cause an increase or a decrease in protein activity, binding 
characteristics, or any other biological, functional, or immunological properties of SIGP.

The phrases "nucleic acid" or "nucleic acid sequence," as used herein, refer to an 
oligonucleotide, nucleotide, polynucleotide, or any fragment thereof, to DNA or RNA of 
genomic or synthetic origin which may be single-stranded or double-stranded and may
represent the sense or the antisense strand, to peptide nucleic acid (PNA), or to any DNA-like
or RNA-like material. In this context, "fragments" refers to those nucleic acid sequences
which are greater than about 60 nucleotides in length, and most preferably are at least about 
100 nucleotides, at least about 1000 nucleotides, or at least about 10,000 nucleotides in 
length. 

The terms "operably associated" or "operably linked," as used herein, refer to
functionally related nucleic acid sequences. A promoter is operably associated or operably
linked with a coding sequence if the promoter controls the transcription of the encoded
polypeptide. While operably associated or operably linked nucleic acid sequences can be
contiguous and in reading frame, certain genetic elements, e.g., repressor genes, are not
contiguously linked to the encoded polypeptide but still bind to operator sequences that
control expression of the polypeptide.

The term "oligonucleotide," as used herein, refers to a nucleic acid sequence of at 
least about 6 nucleotides to 60 nucleotides, preferably about 15 to 30 nucleotides, and most 
preferably about 20 to 25 nucleotides, which can be used in PCR amplification or in a 
hybridization assay or microarray. As used herein, the term "oligonucleotide" is substantially
equivalent to the terms "amplimers," "primers," "oligomers," and "probes," as these terms are
commonly defined in the art. 
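
The nested length ranges in the definition of "oligonucleotide" can be sketched as a simple classifier. The function name and the returned labels are hypothetical, and the word "about" in the definition is treated as an exact bound purely for illustration.

```python
def oligo_length_class(n_nucleotides: int) -> str:
    """Classify a length against the ranges given in the definition above.

    Bounds are treated as exact for illustration only; the definition
    itself qualifies each bound with "about".
    """
    if 20 <= n_nucleotides <= 25:
        return "most preferred"
    if 15 <= n_nucleotides <= 30:
        return "preferred"
    if 6 <= n_nucleotides <= 60:
        return "oligonucleotide"
    return "outside stated range"


if __name__ == "__main__":
    for length in (5, 6, 18, 22, 45, 61):
        print(length, oligo_length_class(length))
```

The narrowest range is tested first so that a length falling in several ranges receives the most specific label.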

"Peptide nucleic acid" (PNA), as used herein, refers to an antisense molecule or 
anti-gene agent which comprises an oligonucleotide of at least about 5 nucleotides in length 
linked to a peptide backbone of amino acid residues ending in lysine. The terminal lysine
confers solubility to the composition. PNAs preferentially bind complementary single
stranded DNA and RNA and stop transcript elongation, and may be pegylated to extend their
lifespan in the cell. (See, e.g., Nielsen, P.E. et al. (1993) Anticancer Drug Des. 8:53-63.)

The term "sample," as used herein, is used in its broadest sense. A biological sample 
suspected of containing nucleic acids encoding SIGP, or fragments thereof, or SIGP itself 
may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane
isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in solution or bound to a solid 
support; a tissue; a tissue print; etc. 

As used herein, the terms "specific binding" or "specifically binding" refer to that 
interaction between a protein or peptide and an agonist, an antibody, or an antagonist. The
interaction is dependent upon the presence of a particular structure of the protein recognized
by the binding molecule (i.e., the antigenic determinant or epitope). For example, if an 
antibody is specific for epitope "A," the presence of a polypeptide containing the epitope A, 
or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody 
will reduce the amount of labeled A that binds to the antibody. 

As used herein, the term "stringent conditions" refers to conditions which permit
hybridization between polynucleotide sequences and the claimed polynucleotide sequences.
Suitably stringent conditions can be defined by, for example, the concentrations of salt or
formamide in the prehybridization and hybridization solutions, or by the hybridization
temperature, and are well known in the art. In particular, stringency can be increased by
reducing the concentration of salt, increasing the concentration of formamide, or raising the
hybridization temperature.

For example, hybridization under high stringency conditions could occur in about
50% formamide at about 37°C to 42°C. Hybridization could occur under reduced stringency
conditions in about 35% to 25% formamide at about 30°C to 35°C. In particular,
hybridization could occur under high stringency conditions at 42°C in 50% formamide, 5X
SSPE, 0.3% SDS, and 200 µg/ml sheared and denatured salmon sperm DNA. Hybridization
could occur under reduced stringency conditions as described above, but in 35% formamide
at a reduced temperature of 35°C. The temperature range corresponding to a particular level
of stringency can be further narrowed by calculating the purine to pyrimidine ratio of the
nucleic acid of interest and adjusting the temperature accordingly. Variations on the above
ranges and conditions are well known in the art.
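
The example ranges given above for high and reduced stringency can be captured in a short sketch. The function name, the returned labels, and the tolerance used to model "about 50%" formamide are assumptions made only for illustration; salt concentration and the other buffer components are not modeled.

```python
def classify_stringency(formamide_pct: float, temp_c: float, tol: float = 2.0) -> str:
    """Map a formamide percentage and temperature to the example ranges above.

    `tol` is a hypothetical allowance (in percentage points) standing in
    for the word "about" in "about 50% formamide".
    """
    # High stringency example: about 50% formamide at about 37 to 42 degrees C.
    if abs(formamide_pct - 50.0) <= tol and 37.0 <= temp_c <= 42.0:
        return "high"
    # Reduced stringency example: about 35% to 25% formamide at about 30 to 35 degrees C.
    if 25.0 <= formamide_pct <= 35.0 and 30.0 <= temp_c <= 35.0:
        return "reduced"
    return "unspecified"


if __name__ == "__main__":
    print(classify_stringency(50, 42))
    print(classify_stringency(35, 35))
    print(classify_stringency(50, 30))
```

Conditions falling outside both example ranges are reported as "unspecified" rather than forced into either category, since the text notes that variations on these ranges are known in the art.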

The term "substantially purified," as used herein, refers to nucleic acid or amino acid 
sequences that are removed from their natural environment and are isolated or separated, and 
are at least about 60% free, preferably about 75% free, and most preferably about 90% free 
from other components with which they are naturally associated.

A "substitution," as used herein, refers to the replacement of one or more amino acids 
or nucleotides by different amino acids or nucleotides, respectively. 

"Transformation," as defined herein, describes a process by which exogenous DNA 
enters and changes a recipient cell. Transformation may occur under natural or artificial 
conditions according to various methods well known in the art, and may rely on any known
method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic
host cell. The method for transformation is selected based on the type of host cell being
transformed and may include, but is not limited to, viral infection, electroporation, heat
shock, lipofection, and particle bombardment. The term "transformed" cells includes stably
transformed cells in which the inserted DNA is capable of replication either as an
autonomously replicating plasmid or as part of the host chromosome, and also refers to cells
which transiently express the inserted DNA or RNA for limited periods of time.

A "variant" of SIGP, as used herein, refers to an amino acid sequence that is altered
by one or more amino acids. The variant may have "conservative" changes, wherein a
substituted amino acid has similar structural or chemical properties (e.g., replacement of
leucine with isoleucine). More rarely, a variant may have "nonconservative" changes (e.g.,
replacement of glycine with tryptophan). Analogous minor variations may also include
amino acid deletions or insertions, or both. Guidance in determining which amino acid
residues may be substituted, inserted, or deleted without abolishing biological or
immunological activity may be found using computer programs well known in the art, for
example, DNASTAR software.

THE INVENTION 

The invention is based on the discovery of new human signal peptide-containing
proteins, collectively referred to as SIGP and individually as SIGP-1, SIGP-2, SIGP-3,
SIGP-4, SIGP-5, SIGP-6, SIGP-7, SIGP-8, SIGP-9, SIGP-10, SIGP-11, SIGP-12, SIGP-13,
SIGP-14, SIGP-15, SIGP-16, SIGP-17, SIGP-18, SIGP-19, SIGP-20, SIGP-21, SIGP-22,
SIGP-23, SIGP-24, SIGP-25, SIGP-26, SIGP-27, SIGP-28, SIGP-29, SIGP-30, SIGP-31,
SIGP-32, SIGP-33, SIGP-34, SIGP-35, SIGP-36, SIGP-37, SIGP-38, SIGP-39, SIGP-40,
SIGP-41, SIGP-42, SIGP-43, SIGP-44, SIGP-45, SIGP-46, SIGP-47, SIGP-48, SIGP-49,
SIGP-50, SIGP-51, SIGP-52, SIGP-53, SIGP-54, SIGP-55, SIGP-56, SIGP-57, SIGP-58,
SIGP-59, SIGP-60, SIGP-61, SIGP-62, SIGP-63, SIGP-64, SIGP-65, SIGP-66, SIGP-67,
SIGP-68, SIGP-69, SIGP-70, SIGP-71, SIGP-72, SIGP-73, SIGP-74, SIGP-75, SIGP-76, and
SIGP-77; the polynucleotides encoding SIGP (SEQ ID NO:78, SEQ ID NO:79, SEQ ID
NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85,
SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID
NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96,
SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID
NO:102, SEQ ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID
NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID
NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID
NO:117, SEQ ID NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID
NO:122, SEQ ID NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID
NO:127, SEQ ID NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID
NO:132, SEQ ID NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID
NO:137, SEQ ID NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID
NO:142, SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID
NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID
NO:152, SEQ ID NO:153, and SEQ ID NO:154); and the use of these compositions for the
diagnosis, treatment, or prevention of cancer and immunological disorders. Table 1 shows
the sequence identification numbers, Incyte Clone identification number, cDNA library,
NCBI sequence identifier and GenBank species description for each of the human signal
peptide-containing proteins disclosed herein.
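
The numbering used in this specification pairs each SIGP with two sequence identifiers: the polypeptide SIGP-n corresponds to SEQ ID NO:n, and its encoding polynucleotide to SEQ ID NO:(n + 77), so that SIGP-1 corresponds to SEQ ID NO:1 and SEQ ID NO:78, and SIGP-77 to SEQ ID NO:77 and SEQ ID NO:154. A minimal sketch of that lookup follows; the function and constant names are hypothetical and serve only as a reading aid.

```python
from typing import Tuple

SIGP_COUNT = 77  # number of SIGP polypeptides enumerated above


def seq_id_numbers(sigp_index: int) -> Tuple[int, int]:
    """Return (polypeptide SEQ ID NO, polynucleotide SEQ ID NO) for SIGP-n.

    Assumes the offset of 77 between the two identifiers holds for every
    SIGP, as the enumerated ranges 1-77 and 78-154 indicate.
    """
    if not 1 <= sigp_index <= SIGP_COUNT:
        raise ValueError("SIGP index must be between 1 and 77")
    return sigp_index, sigp_index + SIGP_COUNT


if __name__ == "__main__":
    for n in (1, 17, 77):
        polypeptide, polynucleotide = seq_id_numbers(n)
        print(f"SIGP-{n}: SEQ ID NO:{polypeptide} / SEQ ID NO:{polynucleotide}")
```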

Nucleic acids encoding the SIGP-1 of the present invention were first identified in
Incyte Clone 305841 from the heart tissue cDNA library (HEARNOT01) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:78, was
derived from Incyte Clones 305841 (HEARNOT01), 22049 (ADENINB01), 168880
(LIVRNOT01), 1321915 (BLADNOT04), and the shotgun sequences SAWA02804, 
SAWA02781, SAWA01969, and SAWA01937. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:1. SIGP-1 is 348 amino acids in length and has a potential
amidation site at Q120; a potential N-glycosylation site at N181; two potential casein kinase
II phosphorylation sites at S19 and T279; a potential glycosaminoglycan attachment site at
S35; and three potential protein kinase C phosphorylation sites at S19, S268, and S343.
SIGP-1 shares 56% identity with human GP36b glycoprotein (GI 505652). The fragment of
SEQ ID NO:78 including the 5' region from about nucleotide 117 to about nucleotide 161 is
useful for hybridization. Northern analysis shows the expression of this sequence in 
reproductive, neural, cardiovascular, hematopoietic and immune, and developmental cDNA 
libraries. Approximately 42% of these libraries are associated with neoplastic disorders, 28% 
with inflammation, and 21% with cell proliferation. 

Nucleic acids encoding the SIGP-2 of the present invention were first identified in 
Incyte Clone 322866 from the eosinophil cDNA library (EOSIHET02) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:79, was 
derived from Incyte Clones 322866 (EOSIHET02), 470107 (MMLR1DT01), 873933 
(LUNGAST01), and 2268817 (UTRSNOT02).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:2. SIGP-2 is 194 amino acids in length and has two potential 
N-glycosylation sites at N129 and N148; two potential casein kinase II phosphorylation sites 
at S74 and S151; four potential protein kinase C phosphorylation sites at S5, S74, S130, and 
S163; a potential tyrosine kinase phosphorylation site at Y171; two potential prokaryotic
membrane lipoprotein lipid attachment sites at F15 and S61; and a transmembrane 4 protein
family signature from G60 to L82. SIGP-2 shares 90% identity with CD53, a human cell
surface antigen (GI 180141). The fragment of SEQ ID NO:79 from about nucleotide 624 to
about nucleotide 686 is useful for hybridization. Northern analysis shows the expression of
this sequence in hematopoietic and immune, gastrointestinal, cardiovascular, reproductive,
musculoskeletal, and neural cDNA libraries. Approximately 54% of these libraries are
associated with inflammation, 39% with neoplastic disorders, and 11% with cell proliferation.

Nucleic acids encoding the SIGP-3 of the present invention were first identified in 
Incyte Clone 546656 from the bronchial epithelium primary cell line cDNA library 
(BEPINOT01) using a computer search for amino acid sequence alignments. A consensus 
sequence, SEQ ID NO:80, was derived from Incyte Clones 546656 (BEPINOT01), 1316266 
(BLADTUT02), 2095988 (BRAITUT02), 1318172 (BLADNOT04), 2809506 
(TLYMNOT04), 1293412 and 1293630 (PGANNOT03), 2585048 (BRAITUT22), 2941370 
(HEAONOT03), 2297230 (BRSTNOT05), 1233586 (LUNGFET03), and the shotgun 
sequence SAEA02986. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:3. SIGP-3 is 342 amino acids in length and has a potential 
amidation site at H4; a potential N-glycosylation site at N23; seven potential casein kinase II
phosphorylation sites at S38, T90, T105, T124, S139, T284, and T324; three potential protein 
kinase C phosphorylation sites at S25, T71, and S200; two potential tyrosine kinase 
phosphorylation sites at Y13 and Y69; and a beta-transducin family Trp-Asp repeats 
signature sequence from 1282 to 1296. SIGP-3 shares 100% identity with human HAN11 (GI
2290530). The fragment of SEQ ID NO:80 from about nucleotide 107 to about nucleotide 
139 is useful for hybridization. Northern analysis shows the expression of this sequence in 
reproductive, cardiovascular, hematopoietic and immune, neural, urologic, and 
developmental cDNA libraries. Approximately 43% of these libraries are associated with 
neoplastic disorders, 25% with inflammation, and 20% with cell proliferation. 

Nucleic acids encoding the SIGP-4 of the present invention were first identified in 
Incyte Clone 693453 from the synovial membrane cDNA library (SYNORAT03) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO:81, was derived from Incyte Clones 693453 (SYNORAT03), 2505458 (CONUTUT01), 
1527363 (UCMCL5T01), 1275308 (TESTTUT02), 1377126 (LUNGNOT10), 538256 
(LNODNOT02), 3125441 (LNODNOT05), 1955296 (CONNNOT01), 1821536 
(GBLATUT01), 2055631 (BEPINOT01), and 2028161 (KERANOT02).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:4. SIGP-4 is 656 amino acids in length and has a potential
N-glycosylation site at N73; nine potential casein kinase II phosphorylation sites at S140,
S191, T250, T252, S330, S340, S517, S617, and T630; a potential leucine zipper pattern
from L430 to L451; four potential N-myristoylation sites at G77, G246, G484, and A651;
eleven potential protein kinase C phosphorylation sites at S18, T90, S93, T318, S490, S503,
S532, T565, T608, S609, and T629; and a potential tyrosine kinase phosphorylation site at
Y326. SIGP-4 shares 20% identity with Caenorhabditis elegans protein encoded by T01G9.4
(GI 1419461). The fragment of SEQ ID NO:81 from about nucleotide 202 to about
nucleotide 255 is useful for hybridization. Northern analysis shows the expression of this
sequence in reproductive, hematopoietic and immune, neural, and developmental cDNA 
libraries. Approximately 40% of these libraries are associated with neoplastic disorders, 30% 
with inflammation, and 30% with cell proliferation. 

Nucleic acids encoding the SIGP-5 of the present invention were first identified in
Incyte Clone 866885 from the brain tumor cDNA library (BRAITUT03) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:82, was 
derived from Incyte Clones 866885 (BRAITUT03), 2991983 (KIDNFET02), 067954 
(HUVESTB01), and 1499109 (SINTBST01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:5. SIGP-5 is 236 amino acids in length and has a potential
N-glycosylation site at N199; two potential casein kinase II phosphorylation sites at S8 and
T72; a potential N-myristoylation site at G169; and three potential protein kinase C
phosphorylation sites at T43, S96, and T201. SIGP-5 shares 24% identity with rat syntaxin
(GI 1488683). The fragment of SEQ ID NO:82 from about nucleotide 43 to about nucleotide
93 is useful for hybridization. Northern analysis shows the expression of this sequence in
hematopoietic and immune, reproductive, gastrointestinal, neural, cardiovascular, and 
developmental cDNA libraries. Approximately 43% of these libraries are associated with 
neoplastic disorders, 26% with inflammation, and 19% with cell proliferation. 

Nucleic acids encoding the SIGP-6 of the present invention were first identified in
Incyte Clone 1242271 from the lung tissue cDNA library (LUNGNOT03) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:83, was
derived from Incyte Clones 1242271 (LUNGNOT03), 968114 (BRSTNOT05), 1251728 
(LUNGFET03), and the shotgun sequence SAZA00142. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:6. SIGP-6 is 195 amino acids in length and has a potential 
cAMP- and cGMP-dependent protein kinase phosphorylation site at S79; six potential casein 
kinase II phosphorylation sites at S79, T85, S113, T166, T171, and T188; three potential
protein kinase C phosphorylation sites at S20, S150, and S185; and a potential mitochondrial
energy transfer proteins signature from P25 to Y33. The fragment of SEQ ID NO:83 from
about nucleotide 98 to about nucleotide 133 is useful for hybridization. Northern analysis 
shows the expression of this sequence in urologic, neural, reproductive, and cardiovascular 
cDNA libraries. Approximately 50% of these libraries are associated with neoplastic 
disorders, 14% with inflammation, and 21% with cell proliferation. 

Nucleic acids encoding the SIGP-7 of the present invention were first identified in 
Incyte Clone 1255027 from the fetal lung cDNA library (LUNGFET03) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:84, was
derived from Incyte Clones 1255027 (LUNGFET03), 2055704 (BEPINOT01), 1351096 
(LATRTUT02), 835188 (PROSNOT07), and 1695810 (COLNNOT23). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:7. SIGP-7 is 608 amino acids in length and has a potential 
amidation site at T112; five potential N-glycosylation sites at N73, N110, N410, N436, and
N478; two potential cAMP- and cGMP-dependent protein kinase phosphorylation sites at
S123 and S185; ten potential casein kinase II phosphorylation sites at T2, S75, S166, S170,
S185, S274, S463, S505, S517, and T588; and thirteen potential protein kinase C
phosphorylation sites at T19, S32, S46, T112, T221, S274, S299, T337, S373, S412, S431,
S438, and S555. SIGP-7 shares 16% identity with canine pinin (GI 1684845). The fragment
of SEQ ID NO:84 from about nucleotide 181 to about nucleotide 219 is useful for
hybridization. Northern analysis shows the expression of this sequence in reproductive,
gastrointestinal, neural, cardiovascular, and developmental cDNA libraries. Approximately
43% of these libraries are associated with neoplastic disorders, 21% with inflammation, and
20% with cell proliferation.

Nucleic acids encoding the SIGP-8 of the present invention were first identified in 
Incyte Clone 1273453 from the testicle cDNA library (TESTTUT02) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:85, was 
derived from Incyte Clones 1273453 (TESTTUT02), 1970337 (UCMCL5T01), 1218926
(NEUTGMT01), 1881349 (LEUKNOT03), and 1722377 (BLADNT06).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:8. SIGP-8 is 267 amino acids in length and has a potential
N-glycosylation site at N230, five potential casein kinase II phosphorylation sites at S9, T45,
T77, S190, and T263, and two potential protein kinase C phosphorylation sites at S232 and
S236. The fragment of SEQ ID NO:85 from about nucleotide 140 to about nucleotide 175
is useful for hybridization. Northern analysis shows the expression of this sequence in
reproductive, cardiovascular, and hematopoietic and immune cDNA libraries.
Approximately 42% of these libraries are associated with neoplastic disorders and 40%
with immune response.

Nucleic acids encoding the SIGP-9 of the present invention were first identified in
Incyte Clone 1275261 from the testicle cDNA library (TESTTUT02) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:86, was
derived from Incyte Clones 1275261 (TESTTUT02), 775078 (COLNNOT05), 514772
(MMLR1DT01), and 3224071 (COLNNON03).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:9. SIGP-9 is 285 amino acids in length and has a potential 
amidation site at S260, three potential N-glycosylation sites at N85, N100 and N156, a
potential cAMP- and cGMP-dependent protein kinase phosphorylation site at T168, three
potential casein kinase II phosphorylation sites at T168, T215, and S230, three potential
protein kinase C phosphorylation sites at S163, S230, and S260, and a potential tyrosine
kinase phosphorylation site at Y72. SIGP-9 shares 24% identity with rat OX-45 antigen
preprotein (GI 56805). The fragment of SEQ ID NO:86 from about nucleotide 243 to
about nucleotide 293 is useful for hybridization. Northern analysis shows the expression of
this sequence in reproductive, gastrointestinal, and hematopoietic and immune cDNA
libraries. Approximately 50% of these libraries are associated with neoplastic disorders
and 50% with immune response.

Nucleic acids encoding the SIGP-10 of the present invention were first identified in 
Incyte Clone 1281682 from the colon cDNA library (COLNNOT16) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:87, was
derived from Incyte Clones 2681940 (SINIUCT01), 1335652 (COLNNOT13), 2079572 
(UTRSNOT08), 627405 (PGANNOT01) and 1281682 and 1282887 (COLNNOT16). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:10. SIGP-10 comprises a peptide of 76 amino acids in
length, and has a potential signal peptide sequence from M1 to S18. The fragment of SEQ
ID NO:87 encoding the potential signal peptide sequence from about nucleotide 908
through 970 is useful for hybridization. Northern analysis shows the expression of this
sequence in gastrointestinal, neural, reproductive, and hematopoietic and immune cDNA
libraries. Approximately 32% of these libraries are associated with neoplastic disorders
and 53% with immune response.

Nucleic acids encoding the SIGP-11 of the present invention were first identified in 
Incyte Clone 1298305 from the breast cDNA library (BRSTNOT09) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:88, was
derived from Incyte Clones 1298305 (BRSTNOT09), 3451203 (UTRSNON03), 2529672
(GBLAN0502), 2780863 (OVARTUT03), 927988 (BRAINOT04), 1684424
(PROSNOT15), 2243053 (PANCTUT02), and shotgun sequences SANA03310 and
SANA00700. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:11. SIGP-11 is 147 amino acids in length and has a
prokaryotic membrane lipoprotein lipid attachment site from L34 through C44. SIGP-11
also has a potential cAMP- and cGMP-dependent protein kinase phosphorylation site at
S91, and a potential protein kinase C phosphorylation site at S13. The fragment of SEQ
ID NO:88 from about nucleotide 1561 to about nucleotide 1611 is useful for hybridization.
Northern analysis shows the expression of this sequence in reproductive, gastrointestinal,
and neural cDNA libraries. Approximately 50% of these libraries are associated with
neoplastic disorders and 22% with immune response.

Nucleic acids encoding the SIGP-12 of the present invention were first identified in 
Incyte Clone 1360501 from the lung cDNA library (LUNGNOT12) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:89, was 
derived from Incyte Clones 1360501 (LUNGNOT12), 2121661 (BRSTNOT07), 1706518 
(DUODNOT02) and shotgun sequences SAJA02519, SAJA00749, SAJA01160, and 
SANA00513. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:12. SIGP-12 is 261 amino acids in length and has six
potential N-glycosylation sites at N19, N28, N98, N104, N164 and N178. SIGP-12 also
has five potential casein kinase II phosphorylation sites at T82, S83, T91, T160, and S233,
and nine potential protein kinase C phosphorylation sites at T35, T60, T82, S121, S131,
T184, S233, S237, and T242. SIGP-12 shares 22% identity with Trypanosoma cruzi
mucin-like protein (GI 1019433). In addition, SIGP-12 shares two potential
phosphorylation sites and a potential N-glycosylation site with the mucin-like protein. The
fragment of SEQ ID NO:89 from about nucleotide 183 to about nucleotide 236 is useful for
hybridization. Northern analysis shows the expression of this sequence in reproductive, 
cardiovascular, and gastrointestinal cDNA libraries. Approximately 39% of these libraries 
are associated with neoplastic disorders and 26% with immune response. 

Nucleic acids encoding the SIGP-13 of the present invention were first identified in 
Incyte Clone 1362406 from the lung cDNA library (LUNGNOT12) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:90, was 
derived from Incyte Clones 1362406 (LUNGNOT12), 1854401 (HNT3AZT01), 1570003 
(UTRSNOT05) and shotgun sequences SANA03704, SANA00366, and SANA02152. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:13. SIGP-13 is 213 amino acids in length and has three
potential protein kinase C phosphorylation sites at T40, S136, and T166. In addition,
SIGP-13 has a highly hydrophobic signal peptide sequence from residue M1 to E34.
SIGP-13 shares 20% identity with a Mycobacterium tuberculosis membrane protein (GI
2072705). The fragment of SEQ ID NO:90 encoding the potential signal peptide sequence
domain from about nucleotide 157 to about nucleotide 219 is useful for hybridization.
Northern analysis shows the expression of this sequence in reproductive, developmental, 
neural, and cardiovascular cDNA libraries. Approximately 50% of these libraries are 
associated with neoplastic disorders and 18% with immune response.

Nucleic acids encoding the SIGP-14 of the present invention were first identified in
Incyte Clone 1405329 from the heart cDNA library (LATRTUT02) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:91, was 
derived from Incyte Clones 1405329 (LATRTUT02), and 2830813 (TLYMNOT03). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:14. SIGP-14 is 67 amino acids in length and has a cell
attachment sequence comprising R13 through D15. In addition, SIGP-14 has a potential
casein kinase II phosphorylation site at T12, and a potential protein kinase C
phosphorylation site at T42. The fragment of SEQ ID NO:91 from about nucleotide 36 to
about nucleotide 95 is useful for hybridization. Northern analysis shows the expression of
this sequence in cardiovascular, developmental, reproductive, and hematopoietic and
immune cDNA libraries. Approximately 43% of these libraries are associated with
neoplastic disorders and 21% with immune response.

Nucleic acids encoding the SIGP-15 of the present invention were first identified in
Incyte Clone 1415223 from the brain cDNA library (BRAINOT12) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:92, was
derived from Incyte Clones 1415223 (BRAINOT12) and 529786 (BRAINOT03). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO: 15. SIGP-15 is 161 amino acids in length and has a potential 
N-glycosylation site at N57, two potential casein kinase II phosphorylation sites at S84 and 
25 S96, and five potential protein kinase C phosphorylation sites at Sll, T62, S75, S83, and 
S84. SIGP-15 shares 30% identity with rat Ly6C antigen (GI 205250). The fragment of 
SEQ ID NO: 92 from about nucleotide 28 to about nucleotide 81 is useful for hybridization. 
Northern analysis shows the expression of this sequence in developmental, reproductive, 
and neural cDNA libraries. Approximately 33% of these libraries are associated with 
neoplastic disorders, 33% with cell proliferation, and 17% with immune response.

Nucleic acids encoding the SIGP-16 of the present invention were first identified in
Incyte Clone 1416553 from the brain cDNA library (BRAINOT12) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:93, was 
derived from Incyte Clones 1416553 (BRAINOT12) and 663124 (BRAINOT03), and shotgun
sequences SANA01409, SANA03513, and SANA02713.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:16. SIGP-16 is 141 amino acids in length and has a
glycosaminoglycan attachment site at S20. In addition, SIGP-16 has a potential casein
kinase II phosphorylation site at S61, and a potential protein kinase C phosphorylation site
at S53. The fragment of SEQ ID NO:93 from about nucleotide 784 to about nucleotide
831 is useful for hybridization. Northern analysis shows the expression of this sequence in 
neural cDNA libraries. Approximately 27% of these libraries are associated with 
neoplastic disorders, and 27% with neurological disorders. 
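
As an illustration of how the nucleotide ranges recited above can be read, the following Python sketch extracts a fragment from 1-based, inclusive coordinates and derives its reverse complement, as might be done when preparing a hybridization probe. The function names and the toy sequence are hypothetical and are not part of the disclosure; the sequence of SEQ ID NO:93 is not reproduced here.

```python
# Complement table for unambiguous DNA bases (assumption: upper-case A, C, G, T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def fragment(seq, start, end):
    """Return the fragment of seq from position start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    toy = "ACGTACGTAC"              # hypothetical sequence, for illustration only
    probe = fragment(toy, 3, 6)     # positions 3 through 6
    print(probe, reverse_complement(probe))
```

Under this convention, a fragment running from nucleotide 784 to nucleotide 831 is 48 nucleotides long.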

Nucleic acids encoding the SIGP-17 of the present invention were first identified in
Incyte Clone 1418517 from the kidney cDNA library (KIDNNOT09) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:94, was 
derived from Incyte Clones 1418517 (KIDNNOT09), 2456866 (ENDANOT01), 136927 
(SYNORAB01), 1620442 (BRAITUT13), 1492394 (PROSNON01), 1534435 
(SPLNNOT04), and 2505923 (CONUTUT01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:17. SIGP-17 is 152 amino acids in length and has a potential
N-glycosylation site at N76; a potential cAMP- and cGMP-dependent protein kinase
phosphorylation site at T67; four potential casein kinase II phosphorylation sites at S9,
T30, S107, and S124; and three potential protein kinase C phosphorylation sites at T30,
S34, and T78. The fragment of SEQ ID NO:94 from about nucleotide 49 to about
nucleotide 99 is useful for hybridization. Northern analysis shows the expression of this
sequence in reproductive, cardiovascular, musculoskeletal, and gastrointestinal cDNA 
libraries. Approximately 44% of these libraries are associated with neoplastic disorders, 
23% with immune response, and 20% with cell proliferation. 
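
Potential sites of the kinds listed above are conventionally located by matching short consensus patterns against the amino acid sequence. The Python sketch below assumes the widely used PROSITE-style consensus patterns (N-{P}-[ST]-{P} for N-glycosylation, [ST]-x(2)-[DE] for casein kinase II, and [ST]-x-[RK] for protein kinase C). The source does not state which patterns or software were used, and the function name and toy peptide are hypothetical.

```python
import re

# Assumed consensus patterns, written as regular expressions.
MOTIFS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "casein kinase II": r"[ST]..[DE]",
    "protein kinase C": r"[ST].[RK]",
}


def find_sites(sequence):
    """Map each motif name to labels such as 'N76' (residue plus 1-based position)."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping sites are all reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            f"{sequence[m.start()]}{m.start() + 1}" for m in regex.finditer(sequence)
        ]
    return hits


if __name__ == "__main__":
    toy_peptide = "MANGSARGDKSTAEE"  # hypothetical peptide, not a SIGP sequence
    for motif, sites in find_sites(toy_peptide).items():
        print(motif, sites)
```

A pattern match only marks a potential site; whether a residue is actually modified depends on structural and cellular context.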

Nucleic acids encoding the SIGP-18 of the present invention were first identified in
Incyte Clone 1438165 from the pancreas cDNA library (PANCNOT08) using a computer
search for amino acid alignments. A consensus sequence, SEQ ID NO:95, was derived from 
Incyte Clones 360389 (SYNORAB01), 485693 (HNT2RAT01), 1233177 (LUNGFET03), 
1255551 (MENITUT03), 1438165 (PANCNOT08), 1554990 (BLADTUT04), and shotgun
sequences SAOA00854 and SAOA00855. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:18. SIGP-18 is 742 amino acids in length and has a potential
N-glycosylation site at N448; a microbodies C-terminal targeting signal in the triplet 
N740HL; twelve potential casein kinase II phosphorylation sites at S3, S53, S120, T122, 
T169, T178, S179, S195, T284, S290, S400, and S573; five potential protein kinase C 
phosphorylation sites at T178, S195, S208, S299, and S364; and two potential tyrosine kinase 
phosphorylation sites at Y296 and Y512. Cysteine residues, representing potential 
intramolecular disulfide bridging sites, are found at residues C87, C204, C312, C339, C343,
C469, C497, C558, C657, C693, and C720. SIGP-18 shares 19% homology with the C. elegans
protein encoded by M163.4 (GI 1515161), including eight of the eleven cysteine residues
found in SIGP-18. The fragment of SEQ ID NO:95 from about nucleotide 322 to about
nucleotide 387 is useful for hybridization. Northern analysis shows the expression of this 
sequence in cardiovascular, male and female reproductive, and gastrointestinal cDNA 
libraries. Approximately 44% of these libraries are associated with neoplastic disorders, 23% 
with inflammation and the immune response, and 19% with fetal development. 

Nucleic acids encoding the SIGP-19 of the present invention were first identified in
Incyte Clone 1440381 from the thyroid cDNA library (THYRNOT03) using a computer 
search for amino acid alignments. A consensus sequence, SEQ ID NO:96, was derived from 
Incyte Clones 989671 (COLNNOT11), 1440381 (THYRNOT03), 3507668 (CONCNOT01),
and shotgun sequences SAOA03364, SAOA02692, SAOA00489, SAOA02355, 
SAOA02405, SAOA01209, SAOA00809, and SAOA00274. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:19. SIGP-19 is 805 amino acids in length and has three
potential N-glycosylation sites at N211, N215, and N327; one potential cAMP- and
cGMP-dependent protein kinase phosphorylation site at T749; sixteen potential casein kinase II
phosphorylation sites at S8, T54, T175, T228, S229, S250, S292, S329, T390, S401, S415,
S471, S492, S671, T780, and S795; ten potential protein kinase C phosphorylation sites at 
S206, T396, S401, S442, T455, S600, S671, T683, S730, and S795; and two potential 
tyrosine kinase phosphorylation sites at Y437 and Y476. SIGP-19 shares 33% homology 
with a ubiquitin-conjugating, E2-like enzyme from C. elegans (GI 1065459). Both molecules
share a "UBC domain" characteristic of ubiquitin-conjugating enzymes extending from
approximately residue V559 to I647 of SIGP-19, and containing an active site cysteine
residue, C614, required for thiolester formation. A characteristic proline-rich region, found at
the N-terminal end of the UBC domain and extending from approximately P564 to P589 in
SIGP-19, is also shared by both proteins. The fragment of SEQ ID NO:96 from about
nucleotide 1678 to about nucleotide 1800 is useful for hybridization. Northern analysis 
shows the expression of this sequence in cardiovascular and male and female reproductive 
cDNA libraries. Approximately 50% of these libraries are associated with neoplastic 
disorders, 14% with inflammation and the immune response, and 19% with fetal
development.

Nucleic acids encoding the SIGP-20 of the present invention were first identified in 
Incyte Clone 1510839 from the lung cDNA library (LUNGNOT14) using a computer search 
for amino acid alignments. A consensus sequence, SEQ ID NO:97, was derived from Incyte 
Clones 962326 (BRSTTUT03), 1383254 (BRAITUT08), 1510839 (LUNGNOT14), 1970949 
(UCMCL5T01), 2214224 (SINTFET03), and shotgun sequences SAOA01059 and
SAOA02595. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:20. SIGP-20 is 195 amino acids in length and has a potential 
signal peptide sequence between M1 and A39. SIGP-20 also has a potential N-glycosylation
site at N83; three potential casein kinase II phosphorylation sites at T161, T169, and
T181; and three potential protein kinase C phosphorylation sites at T121, T143, and T153.
SIGP-20 shares 21% homology with Plasmodium berghei merozoite surface protein-1 (GI
2145052). The fragment of SEQ ID NO:97 from about nucleotide 439 to about nucleotide 
502 is useful for hybridization. Northern analysis shows the expression of this sequence in
cardiovascular, male and female reproductive, and developmental cDNA libraries.
Approximately 48% of these libraries are associated with neoplastic disorders, 13% with
inflammation and the immune response, and 19% with fetal development.

Nucleic acids encoding the SIGP-21 of the present invention were first identified in 
Incyte Clone 1534876 from the spleen cDNA library (SPLNNOT04) using a computer search 
for amino acid alignments. A consensus sequence, SEQ ID NO:98, was derived from Incyte
Clones 1253004 (LUNGFET03), 1382838 (BRAITUT08), 1532501 (SPLNNOT04), 
1534876 (SPLNNOT04), 1705806 (DUODNOT02), 1738301 (COLNNOT22), 1926209 
(BRSTNOT02), and shotgun sequences SAOA00587, SAOA02048, and SAOA03535. 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:21. SIGP-21 is 161 amino acids in length and has a potential
signal peptide sequence between M1 and C13. SIGP-21 also has 17 cysteine residues with
the potential for forming intramolecular disulfide bridges. Six of these cysteine residues,
between residues C129 and C152, are found in a signature sequence for trypsin/alpha-amylase
inhibitors that form a structure with intramolecular disulfide bridges. SIGP-21 has
two potential casein kinase II phosphorylation sites at T25 and S35; and two potential protein
kinase C phosphorylation sites at S35 and T87. The fragment of SEQ ID NO:98 from about
nucleotide 406 to about nucleotide 477, which encompasses the trypsin/alpha-amylase
inhibitor signature sequence, is useful for hybridization. Northern analysis shows the
expression of this sequence in gastrointestinal and male and female reproductive cDNA
libraries. Approximately 45% of these libraries are associated with neoplastic disorders and
28% with inflammation and the immune response. 

Nucleic acids encoding the SIGP-22 of the present invention were first identified in 
Incyte Clone 1559131 from the spleen cDNA library (SPLNNOT04) using a computer search 
for amino acid alignments. A consensus sequence, SEQ ID NO:99, was derived from Incyte
Clones 1559131 (SPLNNOT04), 1671080 (BMARNOT03), 1924001 (BRSTTUT01), and
shotgun sequences SAPA01073 and SAOA02895. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:22. SIGP-22 is 160 amino acids in length and has cysteine
residues capable of forming intramolecular disulfide bridges at C40, C47, C108, C114, C129,
C154, and C158. SIGP-22 has one potential casein kinase II phosphorylation site at S9 and
one potential protein kinase C phosphorylation site at S31. SIGP-22 shares 26% homology
with C-215 protein from Saccharomyces cerevisiae (GI 496667), including four of the
cysteine residues found in SIGP-22. The fragment of SEQ ID NO:99 from about nucleotide
154 to about nucleotide 193 is useful for hybridization. Northern analysis shows the
expression of this sequence in hematopoietic and male and female reproductive cDNA
libraries. Approximately 33% of these libraries are associated with neoplastic disorders and
67% with the immune response. 

Nucleic acids encoding the SIGP-23 of the present invention were first identified in 
Incyte Clone 1601473 from the bladder cDNA library (BLADNOT03) using a computer
search for amino acid alignments. A consensus sequence, SEQ ID NO:100, was derived from
Incyte Clone 1601473 (BLADNOT03) and shotgun sequences SAOA00407, SAOA02497,
SAOA02747, and SAOA02958. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:23. SIGP-23 is 76 amino acids in length and has two cysteine
residues with the potential of forming an intramolecular disulfide bridge at C58 and C72.
SIGP-23 has one potential casein kinase II phosphorylation site at S7 and three potential
protein kinase C phosphorylation sites at S7, T29, and T46. The fragment of SEQ ID NO:100
from about nucleotide 139 to about nucleotide 180 is useful for hybridization. Northern
analysis shows the expression of this sequence in breast, brain, spleen, thyroid, and bladder
cDNA libraries. Approximately 33% of these libraries are associated with neoplastic
disorders, 17% with neural disorders, and 17% with immune disorders. 

Nucleic acids encoding the SIGP-24 of the present invention were first identified in 
Incyte Clone 1615809 from the brain tumor cDNA library (BRAITUT12) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:101, was
derived from Incyte Clones 1615809 (BRAITUT12), 924499 (BRAINOT04), 1273065
(TESTTUT02), 1517058 (PANCTUT01), 1596867 (BRAINOT14), and 1361446 
(LUNGNOT12), and shotgun sequence SAOA02975. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:24. SIGP-24 is 336 amino acids in length and has 13 potential
phosphorylation sites at T27, T72, S74, S76, T99, S104, S109, S140, S178, S210, T281,
S326, and S39. SIGP-24 also has a potential signal peptide sequence between M1 and Y18. The
fragment of SEQ ID NO:101 from about nucleotide 187 to about nucleotide 247 is useful for
hybridization. Northern analysis shows the expression of this sequence in cardiovascular,
gastrointestinal, neural, and reproductive cDNA libraries. Approximately 48% of these
libraries are associated with neoplastic disorders and 21% with immune response.

Nucleic acids encoding the SIGP-25 of the present invention were first identified in 
Incyte Clone 1634813 from the cecal tissue cDNA library (COLNNOT19) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:102, was
derived from Incyte Clones 1634813 (COLNNOT19), 2904583 (THYMNOT05), 1634813
(COLNNOT19), and 1310492 (COLNFET02), and shotgun sequence SAPA04436.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:25. SIGP-25 is 150 amino acids in length and has one potential 
N-glycosylation site at N139; and five potential phosphorylation sites at T48, S118, S126,
S135, and S136. SIGP-25 also has a potential signal peptide sequence encompassing residues
M1-A23. SIGP-25 shares 28% identity with mouse beta chemokine, Exodus-2 (GI 2196924).
The fragment of SEQ ID NO:102 from about nucleotide 175 to about nucleotide 235 is useful
for hybridization. Northern analysis shows the expression of this sequence in
gastrointestinal, developmental, hematopoietic, and immunological cDNA libraries.
Approximately 50% of these libraries are associated with fetal development/cell proliferation
and 25% with immune response.

Nucleic acids encoding the SIGP-26 of the present invention were first identified in 
Incyte Clone 1638407 from the myometrial tissue cDNA library (UTRSNOT06) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO:103, was derived from Incyte Clones 1638407 (UTRSNOT06), 3541410
(SEMVNOT04), 1290413 (BRAINOT11), 1467841 (PANCTUT02), 1306495
(PLACNOT02), and 1907983 (CONNTUT01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:26. SIGP-26 is 217 amino acids in length and has seven 
potential phosphorylation sites at T214, S68, S148, S189, S30, S110, and Y149. SIGP-26
also has a potential signal peptide sequence between M1 and G31. SIGP-26 shares 18%
identity with a mouse proline-rich protein (GI 200547). The fragment of SEQ ID NO:103
from about nucleotide 146 to about nucleotide 206 is useful for hybridization. Northern
analysis shows the expression of this sequence in gastrointestinal, hematopoietic,
immunological, and reproductive cDNA libraries. Approximately 42% of these libraries are
associated with neoplastic disorders and 39% with immune response.

Nucleic acids encoding the SIGP-27 of the present invention were first identified in 
Incyte Clone 1653112 from the prostate tumor tissue cDNA library (PROSTUT08) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:104, was derived from Incyte Clones 1653112 (PROSTUT08), 3450102 (UTRSNON03),
1969850 (UCMCL5T01), 1880259 (LEUKNOT03), 1504393 (BRAITUT07), and 394029
(TMLR2DT01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:27. SIGP-27 is 504 amino acids in length and has eight
potential phosphorylation sites at T338, T13, S38, T56, T132, T490, S33, and T472. SIGP-27
also has one potential leucine zipper pattern between L418 and L439. SIGP-27 shares
16% identity with mouse alpha-1 type-X collagen (GI 49794). The fragment of SEQ ID
NO:104 from about nucleotide 130 to about nucleotide 190 is useful for hybridization.
Northern analysis shows the expression of this sequence in cardiovascular, endocrine,
hematopoietic, immunological, neural, and reproductive cDNA libraries. Approximately
55% of these libraries are associated with neoplastic disorders and 22% with immune
response. 

Nucleic acids encoding the SIGP-28 of the present invention were first identified in 
Incyte Clone 1664634 from the breast tissue cDNA library (BRSTNOT09) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:105, was
derived from Incyte Clones 1664634 (BRSTNOT09) and 571656 (OVARNON01), and
shotgun sequences SAPA04612, SAPA00377, and SAPA03034. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:28. SIGP-28 is 320 amino acids in length and has two potential 
N-glycosylation sites at N122 and N139; and eight potential phosphorylation sites at T30, 
S52, S109, S162, S220, S96, T258, and S280. SIGP-28 also has a potential signal peptide
sequence between M1 and A21. SIGP-28 shares 28% identity with a C. elegans protein
encoded by F32A7.4 (GI 1890375). The fragment of SEQ ID NO:105 from about nucleotide
280 to about nucleotide 340 is useful for hybridization. Northern analysis shows the
expression of this sequence in cardiovascular, gastrointestinal, hematopoietic,
immunological, neural, and reproductive cDNA libraries. Approximately 38% of these
libraries are associated with neoplastic disorders and 32% with immune response. 

Nucleic acids encoding the SIGP-29 of the present invention were first identified in 
Incyte Clone 1690990 from the prostatic tumor tissue cDNA library (PROSTUT10) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO:106, was derived from Incyte Clone 1690990 (PROSTUT10), and shotgun sequences
SAPA01051, SAPA04063, SAPA01670, SAPA02170, SAPA01946, and SAPA00282.

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:29. SIGP-29 is 117 amino acids in length and has one potential
N-glycosylation site at N96; four potential phosphorylation sites at S16, S34, T78, and S62;
and one potential N-myristoylation site at G5. SIGP-29 also has one potential microbodies
C-terminal targeting signal at S115. The fragment of SEQ ID NO:106 from about nucleotide
1000 to about nucleotide 1062 is useful for hybridization. Northern analysis shows the
expression of this sequence in gastrointestinal, reproductive, dermal, musculoskeletal, neural,
and urogenital cDNA libraries. Approximately 77% of these libraries are associated with
neoplastic disorders and 8% with immune response.

Nucleic acids encoding the SIGP-30 of the present invention were first identified in 
Incyte Clone 1704050 from the duodenal cDNA library (DUODNOT02) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:107, was
derived from Incyte Clones 865233 (BRAITUT03), 1359660 (LUNGNOT12), and 1704050
(DUODNOT02) and shotgun sequence SAPA02672.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:30. SIGP-30 is 298 amino acids in length and has one
potential amidation site at P226; four potential N-glycosylation sites at N98, N187, N236,
and N277; seven potential casein kinase II phosphorylation sites at T39, S59, T100, T149,
S205, T284, and S286; three potential protein kinase C phosphorylation sites at T52, S58,
and S279; a potential signal sequence from M1 to G22; and a potential transmembrane
spanning region from M230 to A261. SIGP-30 contains two potential immunoglobulin 
superfamily domains, from about F29 to about L131 and from about S138 to about R224. 
SIGP-30 shares 25% identity with the human A33 antigen precursor expressed in normal 
human colonic and small bowel epithelium and in human colon cancers (GI 1814277). In 
addition, the position of the hydrophobic transmembrane domain is conserved between 
these molecules. The cysteine residues at C50, C109, C139, C155, C214, and C254 are 
conserved between these molecules. The fragment of SEQ ID NO:107 from about
nucleotide 1150 to about nucleotide 1209 is useful for hybridization. Northern analysis 
shows the expression of this sequence in neural, reproductive, cardiovascular, and 
endocrine cDNA libraries. Approximately 68% of these libraries are associated with 
cancer and 9% with immune response. 

Nucleic acids encoding the SIGP-31 of the present invention were first identified in 
Incyte Clone 1711840 from the prostate cDNA library (PROSNOT16) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:108, was
derived from Incyte Clones 1711840 (PROSNOT16) and 2550483 (LUNGTUT06) and 
shotgun sequence SAQA03185. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:31. SIGP-31 is 118 amino acids in length and has three 
potential protein kinase C phosphorylation sites at S48, T103, and S109; and a potential 
signal peptide sequence from M1 to A20. SIGP-31 shares 61% identity with human
midkine, a retinoic acid-responsive heparin binding factor involved in regulation of growth
and differentiation (GI 182651). The fragment of SEQ ID NO:108 from about nucleotide
511 to about nucleotide 555 is useful for hybridization. Northern analysis shows the 
expression of this sequence in reproductive, gastrointestinal, developmental, neural, and 
cardiovascular cDNA libraries. Approximately 58% of these libraries are associated with 
cancer, 16% with immune response, and 23% with fetal/proliferating cells. 

Nucleic acids encoding the SIGP-32 of the present invention were first identified in 
Incyte Clone 1747327 from the stomach tumor cDNA library (STOMTUT02) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:109, was derived from Incyte Clones 475228 (MMLR2DT01), 1500771 (SINTBST01),
1880656 (LEUKNOT03), 1747327 (STOMTUT02), and 2720285 (LUNGTUT10). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:32. SIGP-32 is 248 amino acids in length and has one
potential N-glycosylation site at N56; three potential casein kinase II phosphorylation sites
at S46, S134, and S140; and one potential protein kinase C phosphorylation site at T217. 
SIGP-32 shares 100% identity with human K12 protein precursor which is expressed in 
breast cancer cells and peripheral blood leukocytes (GI 2062391). Northern analysis shows 
the expression of this sequence in gastrointestinal, reproductive, hematopoietic/immune, 
and cardiovascular cDNA libraries. Approximately 59% of these libraries are associated
with cancer and 35% with immune response. 

Nucleic acids encoding the SIGP-33 of the present invention were first identified in 
Incyte Clone 1750632 from the stomach tumor cDNA library (STOMTUT02) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO:110, was derived from Incyte Clones 1521122 (BLADTUT04) and 1750632
(STOMTUT02) and shotgun sequences SAEA02182 and SAEA1002L 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:33. SIGP-33 is 150 amino acids in length and has one
potential protein kinase C phosphorylation site at S6. SIGP-33 shares 49% identity with
the C. elegans protein encoded by R151.6 (GI 459002). The fragment of SEQ ID NO:110
from about nucleotide 514 to about nucleotide 573 is useful for hybridization. Northern 
analysis shows the expression of this sequence in cardiovascular and gastrointestinal cDNA 
libraries. Approximately 88% of these libraries are associated with cancer and 13% with 
immune response. 

Nucleic acids encoding the SIGP-34 of the present invention were first identified in
Incyte Clone 1812375 from the prostate tumor cDNA library (PROSTUT12) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:111, was derived from Incyte Clones 775001 (COLNNOT05), 834305 (PROSNOT07),
1504623 (BRAITUT07), and 1812375 (PROSTUT12) and shotgun sequences SAQA02414,
SATA00657, and SATA01478.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:34. SIGP-34 is 431 amino acids in length and has four
potential N-glycosylation sites at N11, N49, N73, and N312; one potential cAMP- and
cGMP-dependent protein kinase phosphorylation site at S197; six potential casein kinase II
phosphorylation sites at T38, S79, S130, S165, S177, and T188; three potential protein
kinase C phosphorylation sites at S184, T254, and S337; and a potential high affinity
calcium ion-binding, vitamin K-dependent carboxylation domain between W371 and W408.
The fragments of SEQ ID NO:111 from about nucleotide 222 to about nucleotide 282 and
the potential carboxylation domain encoded from about nucleotide 1267 to about nucleotide
1380 are useful for hybridization. Northern analysis shows the expression of this sequence
in reproductive, neural, gastrointestinal, cardiovascular, and hematopoietic/immune cDNA
libraries. Approximately 52% of these libraries are associated with cancer, 24% with 
immune response, and 20% with fetal/proliferating cells. 

Nucleic acids encoding the SIGP-35 of the present invention were first identified in
Incyte Clone 1818761 from the prostate cDNA library (PROSNOT20) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:112, was
derived from Incyte Clone 1818761 (PROSNOT20) and shotgun sequences SAJA00040, 
SAJA00601, SAJA01791, and SAJA02873. 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:35. SIGP-35 is 278 amino acids in length and has one
potential N-glycosylation site at N91; three potential casein kinase II phosphorylation sites
at S9, S125, and S156; two potential protein kinase C phosphorylation sites at S77 and
S224; one potential tyrosine kinase phosphorylation site at Y258; and a potential signal
sequence from M1 to A30. SIGP-35 has fourteen consecutive collagen repeats (G-X-P or
G-X-X) from G97 to P138 which could form a triple helical structure. SIGP-35 shares
28% identity with the human adipocyte complement-related protein precursor (Acrp30) (GI
2493789). The fragment of SEQ ID NO:112 from about nucleotide 157 to about
nucleotide 210 is useful for hybridization. Northern analysis shows the expression of this
sequence in developmental, dermal, gastrointestinal, hematopoietic/immune, neural, and
reproductive cDNA libraries. Approximately 29% of these libraries are associated with
cancer, 43% with immune response, and 29% with fetal development. 
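
The collagen repeat count recited above can be checked arithmetically: the span from G97 to P138 covers 42 residues, or fourteen triplets. The Python sketch below counts consecutive triplets that begin with glycine, starting from a given 1-based position. The function name and the toy sequence are hypothetical; the sequence of SEQ ID NO:35 is not reproduced here.

```python
def count_collagen_repeats(seq, start):
    """Count consecutive G-X-Y triplets beginning at 1-based position start."""
    count = 0
    i = start - 1  # convert to a 0-based index
    # Advance three residues at a time while each complete triplet opens with glycine.
    while i + 3 <= len(seq) and seq[i] == "G":
        count += 1
        i += 3
    return count


if __name__ == "__main__":
    toy = "MA" + "GPP" * 3 + "GAK" + "LL"  # hypothetical sequence with four repeats
    print(count_collagen_repeats(toy, 3))
    print((138 - 97 + 1) // 3)  # number of triplets spanned by G97 through P138
```

Because both G-X-P and G-X-X triplets qualify, the sketch tests only for glycine in the first position of each triplet.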

Nucleic acids encoding the SIGP-36 of the present invention were first identified in 
Incyte Clone 1824469 from the gallbladder tumor cDNA library (GBLADTUT01) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:113, was derived from Incyte Clones 1664262 (BRSTNOT09), 1733422
(BRSTTUT08), 1824469 (GBLADTUT01), 2057044 (BEPINOT01), and 2449822
(ENDANOT01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:36. SIGP-36 is 286 amino acids in length and has one
potential N-glycosylation site at N271; four potential casein kinase II phosphorylation sites
at S50, S192, T230, and T251; and five potential protein kinase C phosphorylation sites at
T29, T41, S50, T160, and T273. SIGP-36 shares 24% identity with the Mycobacterium
tuberculosis protein encoded by MTCI237.14c (GI 2052134). The fragment of SEQ ID
NO:113 from about nucleotide 415 to about nucleotide 468 is useful for hybridization.
Northern analysis shows the expression of this sequence in reproductive, gastrointestinal,
hematopoietic/immune, and neural cDNA libraries. Approximately 49% of these libraries
are associated with cancer, 21% with immune response, and 21% with fetal/proliferating
cells.

Nucleic acids encoding the SIGP-37 of the present invention were first identified in
Incyte Clone 1864292 from the diseased prostate cDNA library (PROSNOT19) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:114, was derived from Incyte Clone 1864292 (PROSNOT19) and shotgun sequences
SARA02195, SARA03070, SARA03675, and SATA02454.

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:37. SIGP-37 is 404 amino acids in length and has one
potential amidation site at V136; one potential cAMP- and cGMP-dependent protein kinase 
phosphorylation site at S66; twenty potential casein kinase II phosphorylation sites at S23, 
T27, T74, S110, S111, S118, T122, S143, S145, S205, S207, S218, S219, S220, T252, S254,
S328, S330, S385, and T393; and twelve potential protein kinase C phosphorylation sites at
T27, S76, T81, S140, S161, S176, S229, T285, S309, S356, S367, and S398. SIGP-37
shares 18% identity with the S. cerevisiae protein encoded by SRP40, a weak suppressor of
a mutant of the subunit AC40 of DNA-dependent RNA polymerases I and II (GI 295671).
The fragment of SEQ ID NO:114 from about nucleotide 193 to about nucleotide 222 is
useful for hybridization. Northern analysis shows the expression of this sequence in 
reproductive, cardiovascular, and hematopoietic/immune cDNA libraries. Approximately 
75% of these libraries are associated with cancer and 25% with immune response. 

Nucleic acids encoding the SIGP-38 of the present invention were first identified in 
Incyte Clone 1866437 from the human promonocyte cell line cDNA library (THP1NOT01)
using a computer search for amino acid sequence alignments. A consensus sequence, SEQ 
ID NO:115, was derived from Incyte Clones 817970 (OVARTUT01), 825684 
(PROSNOT06), 1866437 (THP1NOT01), 2190170 (PROSNOT26), and 3137972 
(SMCCNOT02). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:38. SIGP-38 is 405 amino acids in length and has one
potential N-glycosylation site at N378; one potential cAMP- and cGMP-dependent protein
kinase phosphorylation site at S332; nine potential casein kinase II phosphorylation sites
at T34, S51, T77, S107, S158, S264, T266, S296, and S332; and one potential protein
kinase C phosphorylation site at S68. The fragment of SEQ ID NO:115 from about
nucleotide 85 to about nucleotide
144 is useful for hybridization. Northern analysis shows the expression of this sequence in 
reproductive, hematopoietic/immune, neural, and developmental cDNA libraries. 
Approximately 37% of these libraries are associated with cancer, 33% with immune 
response, and 22% with fetal/proliferating cells. 
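
Potential modification sites of the kind listed above are conventionally located by scanning an amino acid sequence for short consensus patterns. The following is a minimal sketch in Python, assuming the commonly cited PROSITE consensus patterns for N-glycosylation (N-{P}-[ST]-{P}), protein kinase C phosphorylation ([ST]-x-[RK]), and casein kinase II phosphorylation ([ST]-x(2)-[DE]). The function name `find_sites` and the example peptide are hypothetical and are not taken from the sequence listing; the search procedure actually used for the SIGP sequences may differ.

```python
import re

# Assumed PROSITE-style consensus patterns written as regular expressions.
# Each pattern is wrapped in a lookahead so that overlapping sites are reported.
MOTIFS = {
    "N-glycosylation": r"(?=(N[^P][ST][^P]))",
    "protein kinase C": r"(?=([ST].[RK]))",
    "casein kinase II": r"(?=([ST].{2}[DE]))",
}


def find_sites(protein):
    """Return {motif name: [labels]} using 1-based residue labels such as 'S68'."""
    seq = protein.upper()
    sites = {}
    for name, pattern in MOTIFS.items():
        # m.start() is the 0-based index of the first residue of the motif,
        # which is the modified residue for all three patterns above.
        sites[name] = [f"{seq[m.start()]}{m.start() + 1}"
                       for m in re.finditer(pattern, seq)]
    return sites


if __name__ == "__main__":
    # Hypothetical 16-residue peptide, for illustration only.
    for motif, labels in find_sites("MANGSAKTLREESWDE").items():
        print(motif, labels)
```

A pattern match only marks a potential site; whether the residue is modified in vivo depends on context that this scan does not model.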

Nucleic acids encoding the SIGP-39 of the present invention were first identified in
Incyte Clone 1871375 from the leg skin erythema nodosum cDNA library (SKINBIT01)
using a computer search for amino acid sequence alignments. A consensus sequence, SEQ 
ID NO: 116, was derived from Incyte Clones 1428052 (SINTBST01), 1871375 
(SKINBIT01), and 3210563 (BLADNOT08). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:39. SIGP-39 is 177 amino acids in length and has one
potential casein kinase II phosphorylation site at S133; one potential glycosaminoglycan
attachment site at S28GGG; and four potential protein kinase C phosphorylation sites at
S44, S82, S115, and T148. SIGP-39 contains a signature sequence shared by the binding
domains of receptors for lymphokines, hematopoietic growth factors, and growth
hormone-related molecules at S52RWSLWS. The fragment of SEQ ID NO:116 encoding the
sequence surrounding the receptor binding domain signature from about nucleotide 190 to
about nucleotide 249 is useful for hybridization. Northern analysis shows the expression of
this sequence in reproductive, cardiovascular, gastrointestinal, and developmental cDNA
libraries. Approximately 44% of these libraries are associated with cancer and 19% with 
immune response. 
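
A hybridization fragment defined by nucleotide coordinates, such as nucleotide 190 to nucleotide 249 above, can be excised from a consensus sequence programmatically. The sketch below is illustrative only and assumes that the coordinates are 1-based and inclusive, so that positions 190 to 249 span 60 nucleotides. The helper names `fragment` and `reverse_complement` and the placeholder sequence are hypothetical; the actual sequence of SEQ ID NO:116 is not reproduced here.

```python
def fragment(seq, start, end):
    """Return nucleotides start..end of seq (assumed 1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def reverse_complement(dna):
    """Return the reverse complement, i.e. an antisense probe for the fragment."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna.upper()))


if __name__ == "__main__":
    placeholder = "ACGT" * 100           # hypothetical 400-nucleotide stand-in
    probe = fragment(placeholder, 190, 249)
    print(len(probe))                    # 60 nucleotides
    print(reverse_complement(probe))
```

Because the text gives the endpoints as "about" a nucleotide position, the bounds would in practice be treated as approximate rather than exact.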

Nucleic acids encoding the SIGP-40 of the present invention were first identified in 
Incyte Clone 1880830 from the leukocyte cDNA library (LEUKNOT03) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:117, was
derived from Incyte Clones 361577 (PROSNOT01); 2113591 (BRAITUT03); 1880830 
(LEUKNOT03) and shotgun sequences SATA03292 and SATA00377. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:40. SIGP-40 is 197 amino acids in length and has a potential
cAMP- and cGMP-dependent protein kinase phosphorylation site at S121; and four potential
protein kinase C phosphorylation sites at T3, S57, T107, and T153. SIGP-40 shares 15%
identity with the Arabidopsis thaliana zinc-finger protein Lsd1 (GI 1872521). The
fragment of SEQ ID NO:117 from about nucleotide 567 to about nucleotide 621 is useful for
hybridization. Northern analysis shows the expression of this sequence in neural and
reproductive cDNA libraries. Approximately 49% of these libraries are associated with
neoplastic disorders, 24% with immune response, and 16% with fetal development. 

Nucleic acids encoding the SIGP-41 of the present invention were first identified in 
Incyte Clone 1905325 from the ovary cDNA library (OVARNOT07) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO: 118, was 
derived from Incyte Clones 1905325 (OVARNOT07); 621454 (PGANNOT01); 621326
(PGANNOT01); 1264490 (SYNORAT05); 487357 (HNT2AGT01); 773311 (COLNCRT01);
and shotgun sequence SATA03582.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:41. SIGP-41 is 302 amino acids in length and has two
potential N-glycosylation sites at N80 and N252; three potential casein kinase II
phosphorylation sites at S46, T58, and S143; and four potential protein kinase C
phosphorylation sites at T58, S62, T147, and S300. SIGP-41 shares 27% identity with
human necdin-related protein (GI 1754971). The fragment of SEQ ID NO:118 from about
nucleotide 1701 to about nucleotide 1800 is useful for hybridization. Northern analysis
shows the expression of this sequence in reproductive, neural, and gastrointestinal cDNA
libraries. Approximately 51% of these libraries are associated with neoplastic disorders,
20% with immune response, and 18% with fetal development.

Nucleic acids encoding the SIGP-42 of the present invention were first identified in
Incyte Clone 1919931 from the breast tumor cDNA library (BRSTTUT01) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:119, was derived from Incyte Clone 1919931 (BRSTTUT01) and shotgun sequences
SATA02529, SATA01526 and SATA00892. 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:42. SIGP-42 is 164 amino acids in length and has one
potential casein kinase II phosphorylation site at T68; and two potential protein kinase C
phosphorylation sites at T81 and S85. SIGP-42 shares 12% identity with a human chemokine
receptor (GI 2104517). The fragment of SEQ ID NO:119 from about nucleotide 585 to
about nucleotide 630 is useful for hybridization. Northern analysis shows the expression of
this sequence in hematopoietic/immune, reproductive, and neural cDNA libraries.
Approximately 50% of these libraries are associated with neoplastic disorders and 38%
with immune response. 

Nucleic acids encoding the SIGP-43 of the present invention were first identified in 
Incyte Clone 1969426 from the breast tissue cDNA library (BRSTNOT04) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO: 120, was derived from Incyte Clones 1969426 (BRSTNOT04), 2373191 
(ADRENOT07), 1225516 (COLNTUT02), 1555912 (BLADTUT04), 1449240 
(PLACNOT02), and shotgun sequences SAZA01457 and SAZA00207. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:43. SIGP-43 is 235 amino acids in length and has one 
potential N-glycosylation site at N146; one potential glycosaminoglycan attachment site at 
S82; and four potential protein kinase C phosphorylation sites at T16, T43, S228, and S231. 
The fragment of SEQ ID NO: 120 from about nucleotide 243 to about nucleotide 282 is 
useful for hybridization. Northern analysis shows the expression of this sequence in 
neural, reproductive, hematopoietic/immune, cardiovascular, gastrointestinal, and muscle 
cDNA libraries. Approximately 46% of these libraries are associated with neoplastic 
disorders and 28% with immune response. 

Nucleic acids encoding the SIGP-44 of the present invention were first identified in 
Incyte Clone 1969948 from the umbilical cord cDNA library (UCMCL5T01) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO:121, was derived from Incyte Clone 1969948 (UCMCL5T01) and shotgun sequences
SATA01513 and SATA00507. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:44. SIGP-44 is 203 amino acids in length and has three 
potential casein kinase II phosphorylation sites at T23, S114, and S120; one potential protein
kinase C phosphorylation site at T105; and one potential tyrosine kinase phosphorylation site
at Y47. The fragment of SEQ ID NO: 121 from about nucleotide 162 to about nucleotide 216 
is useful for hybridization. Northern analysis shows the expression of this sequence in 
gastrointestinal, hematopoietic/immune, reproductive, and cardiovascular cDNA libraries. 
Approximately 35% of these libraries are associated with neoplastic disorders and 24% 
with immune response. 

Nucleic acids encoding the SIGP-45 of the present invention were first identified in 
Incyte Clone 1988911 from the lung cDNA library (LUNGAST01) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:122, was
derived from Incyte Clones 1988911 (LUNGAST01), 860576 (BRAITUT03), 3188894 
(THYMNON04), 1466606 (PANCTUT02), 1920945 (BRSTTUT01), 1502970 
(BRAITUT07), and shotgun sequence SAZC00040. 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:45. SIGP-45 is 359 amino acids in length and has nine
potential casein kinase II phosphorylation sites at S34, S47, S115, T120, T141, S157, S182,
S214, and S331; three potential protein kinase C phosphorylation sites at S34, T259, and
S325; and one potential tyrosine kinase phosphorylation site at Y241. SIGP-45 shares 16%
identity with rat myosin heavy chain (GI 56649). The fragment of SEQ ID NO:122 from
about nucleotide 477 to about nucleotide 558 is useful for hybridization. Northern
analysis shows the expression of this sequence in reproductive, hematopoietic/immune,
gastrointestinal, and cardiovascular cDNA libraries. Approximately 47% of these libraries
are associated with neoplastic disorders, 33% with immune response, and 20% with fetal
development.

Nucleic acids encoding the SIGP-46 of the present invention were first identified in 
Incyte Clone 2061561 from the ovary cDNA library (OVARNOT03) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO: 123, was 
derived from Incyte Clones 2061561 (OVARNOT03), 2208104 (SINTFET03), 2058750
(OVARNOT03), and shotgun sequences SAZA00915, SAZA00150, and SAZA00799.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:46. SIGP-46 is 150 amino acids in length and has two 
potential amidation sites at F57 and W74; one potential cAMP- and cGMP-dependent protein 
kinase phosphorylation site at T62; two potential casein kinase II phosphorylation sites at
T101 and T110; and two potential protein kinase C phosphorylation sites at T28 and T97.
The fragment of SEQ ID NO: 123 from about nucleotide 82 to about nucleotide 168 is useful 
for hybridization. Northern analysis shows the expression of this sequence in reproductive, 
neural, gastrointestinal, and cardiovascular cDNA libraries. Approximately 54% of these 
libraries are associated with neoplastic disorders and 22% with immune response. 

Nucleic acids encoding the SIGP-47 of the present invention were first identified in
Incyte Clone 2084489 from the pancreas cDNA library (PANCNOT04) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO: 124, was 
derived from Incyte Clones 2084489 (PANCNOT04) and shotgun sequences SAJA00837, 
SAJA00793, SAJA01402, SAJA01533, and SAJA01490. 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:47. SIGP-47 is 402 amino acids in length and has one 
potential N-glycosylation site at N191; seven potential cAMP- and cGMP-dependent protein 
kinase phosphorylation sites at S22, S23, T80, S81, S202, S248, and S382; twenty-two 
potential casein kinase II phosphorylation sites at S8, S35, S56, S107, T152, S166, S170,
S202, S206, S208, T212, S214, S216, T244, S252, S256, T264, T287, S288, T327, S362, and
S387; ten potential protein kinase C phosphorylation sites at S16, S116, S140, T180, S193,
S194, T236, T244, S252, and S387; and one potential tyrosine kinase phosphorylation site at 
Y361. SIGP-47 shares 28% identity with an A. thaliana protein of unknown function (GI 
2262136). The most conserved region, residues 296 to 386 of SIGP-47, shares 70% 
identity with residues 299 to 386 of the A. thaliana protein. In addition, the potential 
amidation site at A314 in SIGP-47 is conserved as one potential amidation site at Q317 in
the A. thaliana protein; and four potential protein kinase C or cAMP- and cGMP-dependent
protein kinase phosphorylation sites at S193, T236, S252 and Y361 in SIGP-47 are 
conserved as potential phosphorylation sites at S165, S219, T247, and Y364 respectively in 
the A. thaliana protein. The fragment of SEQ ID NO: 124 from about nucleotide 468 to 
about nucleotide 531 is useful for hybridization. Northern analysis shows the expression of 
this sequence in neural, gastrointestinal and cardiovascular cDNA libraries. 
Approximately 50% of these libraries are associated with neoplastic disorders and 20% 
with trauma. 
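
Percent identity figures such as those quoted above are derived from a pairwise alignment of the two sequences. As a hedged illustration only, the sketch below computes percent identity from two sequences that have already been aligned, with "-" marking gaps. It assumes that identity is counted over aligned, non-gap columns; the text does not state which denominator or alignment program was used, and other conventions (for example, dividing by the length of the shorter sequence) give different values. The function name `percent_identity` and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue                     # gap columns are excluded (assumption)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from SIGP-47 or the A. thaliana protein.
    print(percent_identity("AC-DE", "ACQDF"))
```

The same calculation can be restricted to a sub-region, as in the comparison of the most conserved region, by slicing both aligned strings before calling the function.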

Nucleic acids encoding the SIGP-48 of the present invention were first identified in 
Incyte Clone 2203226 from the fetal spleen cDNA library (SPLNFET02) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO: 125, was 
derived from Incyte Clones 2203226 (SPLNFET02), 2215960 (SINTFET03), 1291348 
(BRAINOT11), 1874915 (LEUKNOT02), and 275828 (TESTNOT03).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:48. SIGP-48 is 311 amino acids in length and has one 
potential amidation site at V117; one potential casein kinase II phosphorylation site at T215;
and three potential protein kinase C phosphorylation sites at T13, S18, and T263. SIGP-48
shares 32% identity with a human putative Rab5 interacting protein (GI 1911776). The
fragment of SEQ ID NO:125 from about nucleotide 747 to about nucleotide 846 is useful for
hybridization. Northern analysis shows the expression of this sequence in reproductive,
cardiovascular, neural, and gastrointestinal cDNA libraries. Approximately 44% of these
libraries are associated with neoplastic disorders, 30% with fetal/proliferative cells and
tissues, and 23% with immune response.

Nucleic acids encoding the SIGP-49 of the present invention were first identified in
Incyte Clone 2232884 from the prostate cDNA library (PROSNOT16) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:126, was
derived from Incyte Clones 2232884 (PROSNOT16), 2728528 (OVARTUT05), 2232884
(PROSNOT16), and shotgun sequences SASA00238 and SASA00455.

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:49. SIGP-49 is 316 amino acids in length and has one
potential N-glycosylation site at N140; five potential casein kinase II phosphorylation sites at
S3, T8, S29, S85, and T198; and two potential protein kinase C phosphorylation sites at T28
and S60. The fragment of SEQ ID NO: 126 from about nucleotide 180 to about nucleotide 
279 is useful for hybridization. Northern analysis shows the expression of this sequence in 
reproductive, urologic, and neural cDNA libraries. Approximately 77% of these libraries 
are associated with neoplastic disorders. 

Nucleic acids encoding the SIGP-50 of the present invention were first identified in
Incyte Clone 2328134 from the colon cDNA library (COLNNOT11) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:127, was
derived from Incyte Clones 2328134 (COLNNOT11), 1870180 (SKINBIT01), 081403
(SYNORAB01), and 851547 (NGANNOT01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:50. SIGP-50 is 346 amino acids in length and has two
potential cAMP- and cGMP-dependent protein kinase phosphorylation sites at residues S43
and S217; one potential casein kinase II phosphorylation site at residue T96; and five
potential protein kinase C phosphorylation sites at residues T2, T15, T39, T247, and S301.
SIGP-50 shares 33% identity with the human putative rab5-interacting protein (GI 
1911776) and the casein kinase II phosphorylation site at residue T96. The fragment of 
SEQ ID NO: 127 encoding the potential extracellular ligand binding domain from about 
nucleotide 16 to about nucleotide 76 is useful for hybridization. Northern analysis shows 
the expression of this sequence in reproductive, gastrointestinal, cardiovascular, and neural 
cDNA libraries. Approximately 44% of these libraries are associated with cancer, 28% 
are associated with immune response, and 20% with fetal disorders. 

Nucleic acids encoding the SIGP-51 of the present invention were first identified in 
Incyte Clone 2382718 from the pancreatic cDNA library (ISLTNOT01) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO: 128, was 
derived from Incyte Clones 2382718 (ISLTNOT01), 3472492 (LUNGNOT27), 014756 
(THP1PLB01), 1731885 (BRSTTUT08), 1889866 (BLADTUT07), and 1447744 
(PLACNOT02). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO: 51. SIGP-51 is 299 amino acids in length and has one 
potential N-glycosylation site at residue N185; one cAMP- and cGMP-dependent protein 
kinase phosphorylation site at T273; nine potential casein kinase II phosphorylation sites at 
S34, S82, T100, S118, T152, S154, T193, S203, and S287; eight potential protein kinase
C phosphorylation sites at S57, T69, T95, S179, T269, S274, S275, and S284; and a
potential signal peptide sequence from M1 to G27. SIGP-51 shares 26% identity with a
human antigen precursor protein (GI 1814277); the protein kinase C phosphorylation sites 
at residues S57 and T69; and the casein kinase II phosphorylation site at residue T100. 
The fragment of SEQ ID NO: 128 encoding the potential extracellular ligand binding 
domain from about nucleotide 88 to about nucleotide 148 is useful for hybridization. 
Northern analysis shows the expression of this sequence in reproductive, gastrointestinal, 
and cardiovascular cDNA libraries. Approximately 48% of these libraries are associated
with cancer, 29% are associated with immune response, and 20% with fetal disorders. 

Nucleic acids encoding the SIGP-52 of the present invention were first identified in 
Incyte Clone 2452208 from the cardiovascular cDNA library (ENDANOT01) using a 
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO: 129, was derived from Incyte Clones 2452280 (ENDANOT01), 1505094 
(BRAITUT07), 1521239 (BLADTUT04), and 1309844 (COLNFET02). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO: 52. SIGP-52 is 351 amino acids in length and has two 
potential N-glycosylation sites at N241 and N337; two potential cAMP- and
cGMP-dependent protein kinase phosphorylation sites at S201 and T318; six potential casein
kinase II phosphorylation sites at S9, S136, T162, T252, S270, and S302; eight potential
protein kinase C phosphorylation sites at T25, S34, T37, S64, S87, S112, S141, and S322;
and one potential cell attachment sequence at R280GD. The fragment of SEQ ID NO:129
encoding the potential extracellular ligand binding domain from about nucleotide 97 to
about nucleotide 157 is useful for hybridization. Northern analysis shows the expression of
this sequence in reproductive, gastrointestinal, cardiovascular, and neural cDNA libraries.
Approximately 33% of these libraries are associated with cancer, 33% are associated with
immune response, and 26% with fetal disorders.

Nucleic acids encoding the SIGP-53 of the present invention were first identified in
Incyte Clone 2457825 from the aortic endothelial cell cDNA library (ENDANOT01) using
a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO: 130, was derived from Incyte Clone 2457825 (ENDANOT01) and shotgun sequences 
SASA00641, SASA02817, SASA01973, SASA03121, SASA01350, and SASA00693. 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:53. SIGP-53 is 662 amino acids in length and has three
potential cAMP- and cGMP-dependent protein kinase phosphorylation sites at S555, S578,
and S652; ten potential casein kinase II phosphorylation sites at S67, T151, T215, S241,
S470, S471, S482, S556, T589, and T618; one potential leucine zipper pattern from L572
to L593; four potential protein kinase C phosphorylation sites at T2, T21, S80, and T503;
and one potential LIM domain signature site from C402 to L436. SIGP-53 shares 10%
identity with the C. elegans protein encoded by W04D2.1 (GI 1418625); and the casein
kinase II phosphorylation site at residue S241. The fragment of SEQ ID NO:130 encoding
the potential extracellular ligand binding domain from about nucleotide 88 to about
nucleotide 148 is useful for hybridization. Northern analysis shows the expression of this
sequence in hematopoietic, gastrointestinal, reproductive, and cardiovascular cDNA 
libraries. Approximately 43% of these libraries are associated with cancer, 35% are 
associated with immune response, and 22% with fetal disorders. 

Nucleic acids encoding the SIGP-54 of the present invention were first identified in 
Incyte Clone 2470740 from the hematopoietic cDNA library (THP1NOT03) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:131, was derived from Incyte Clone 2470740 (THP1NOT03).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:54. SIGP-54 is 115 amino acids in length and has one
potential protein kinase C phosphorylation site at S85; and one potential insulin family
signature site from C23 to C37. The fragment of SEQ ID NO:131 encoding the potential
extracellular ligand binding domain from about nucleotide 151 to about nucleotide 211 is
useful for hybridization. Northern analysis shows the expression of this sequence in neural
and developmental cDNA libraries. Approximately 33% of these libraries are associated
with cancer and 33% are associated with fetal disorders.

Nucleic acids encoding the SIGP-55 of the present invention were first identified in 
Incyte Clone 2479092 from the aortic endothelial cell cDNA library (SMCANOT01) using 
a computer search for amino acid sequence alignments. A consensus sequence, SEQ ID 
NO:132, was derived from Incyte Clones 2479092 (SMCANOT01) and 1981954
(LUNGTUT03).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:55. SIGP-55 is 157 amino acids in length and has one 
potential casein kinase II phosphorylation site at S31; one potential tyrosine kinase 
phosphorylation site at K150; and a potential signal peptide sequence from M1 to A26.
The fragment of SEQ ID NO:132 encoding the potential extracellular ligand binding
domain from about nucleotide 97 to about nucleotide 157 is useful for hybridization.
Northern analysis shows the expression of this sequence in reproductive, gastrointestinal, 
hematopoietic, and urologic cDNA libraries. Approximately 47% of these libraries are 
associated with cancer and 29% with immune response. 
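
Where a potential signal peptide is identified, as with residues M1 to A26 above, the predicted mature polypeptide is the portion of the precursor that remains after the signal peptide is removed. The sketch below illustrates this under the assumption that cleavage occurs immediately after the last residue of the predicted signal peptide, so that a signal peptide ending at residue 26 leaves a mature polypeptide beginning at residue 27. The function name `split_signal_peptide` and the example sequence are hypothetical.

```python
def split_signal_peptide(precursor, last_signal_residue):
    """Split a precursor into (signal peptide, mature polypeptide).

    last_signal_residue is the 1-based position of the final signal peptide
    residue; cleavage is assumed to occur directly after it.
    """
    if not 0 < last_signal_residue < len(precursor):
        raise ValueError("cleavage position must fall inside the precursor")
    return precursor[:last_signal_residue], precursor[last_signal_residue:]


if __name__ == "__main__":
    # Hypothetical 8-residue precursor with a 5-residue signal peptide.
    signal, mature = split_signal_peptide("MKLLAQDE", 5)
    print(signal, mature)
```

Residue numbers quoted for the precursor, such as phosphorylation sites, are shifted down by the signal peptide length when expressed in mature polypeptide numbering.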

Nucleic acids encoding the SIGP-56 of the present invention were first identified in
Incyte Clone 2480544 from the aortic smooth muscle cell cDNA library (SMCANOT01)
using a computer search for amino acid sequence alignments. A consensus sequence, SEQ
ID NO:133, was derived from Incyte Clones 2480544 (SMCANOT01), 2472409
(THP1NOT03), 1516031 (PANCTUT01), 855817 (NGANNOT01), 1865287
(PROSNOT19), and 677835 (CRBLNOT01).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:56. SIGP-56 is 197 amino acids in length and has one potential
N-glycosylation site at N38; one potential casein kinase II phosphorylation site at S123; two
potential protein kinase C phosphorylation sites at T71 and S82; and a potential signal
peptide sequence from M1 to A27. SIGP-56 shares 15% identity with a Phaseolus vulgaris
protein involved in the stress response (GI 169345) and shows conservation of proline and
tyrosine residues in the C-terminal region. The fragment of SEQ ID NO:133 from about
nucleotide 125 to about nucleotide 160 is useful for hybridization. Northern analysis shows
the expression of this sequence in neural, reproductive, and cardiovascular cDNA libraries.
Approximately 49% of these libraries are associated with neoplastic disorders and 14% with
immune response.

Nucleic acids encoding the SIGP-57 of the present invention were first identified in 
Incyte Clone 2518547 from the brain tumor cDNA library (BRAITUT21) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO: 134, was 
derived from Incyte Clones 2518547 (BRAITUT21), 1509622 (LUNGNOT14), 1562945
(SPLNNOT04), 1640136 (UTRSNOT06), and 1432014 (BEPINON01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:57. SIGP-57 is 245 amino acids in length and has one potential 
casein kinase II phosphorylation site at S27; and two potential protein kinase C 
phosphorylation sites at S5 and T229. SIGP-57 shares 36% identity with a human protein
that binds a regulatory element of the c-myc gene (GI 33969). In addition, the potential
protein kinase C phosphorylation site at T229 is conserved as a potential protein kinase A
phosphorylation site at S176 in the human protein. The fragment of SEQ ID NO:134 from
about nucleotide 742 to about nucleotide 775 is useful for hybridization. Northern analysis
shows the expression of this sequence in hematopoietic, reproductive, and neural cDNA
libraries. Approximately 50% of these libraries are associated with neoplastic disorders and 
28% with immune response. 

Nucleic acids encoding the SIGP-58 of the present invention were first identified in 
Incyte Clone 2530650 from the gallbladder cDNA library (GBLANOT02) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:135, was
derived from Incyte Clones 2530650 (GBLANOT02), 2617724 (GBLANOT01), 3105644 
(BRSTTUT15), 2903466 (DRGCNOT01), 1545010 (PROSTUT04), 2313837 
(NGANNOT01), 1804413 (SINTNOT13), 3207379 (PENCNOT03), 2347051 
(TESTTUT02), 2602493 (UTRSNOT10), 1259341 (MENITUT03), and 81943
(SYNORAB01).

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:58. SIGP-58 is 310 amino acids in length and has one potential
N-glycosylation site at N206; one potential cAMP- and cGMP-dependent protein kinase
phosphorylation site at T97; five potential casein kinase II phosphorylation sites at S62,
S156, S214, S222, and T274; five potential protein kinase C phosphorylation sites at T150,
T167, T208, T265, and S273; one potential tyrosine kinase phosphorylation site at Y96; one
thyroglobulin type-1 repeat signature from F109 to G143; and a potential signal peptide
sequence from M1 to A21. SIGP-58 shares 18% identity with bovine thyroglobulin (GI
2204111) and 46% identity between F109 and G143, the thyroglobulin type-1 repeat
signature. The fragment of SEQ ID NO:135 from about nucleotide 92 to about nucleotide
127 is useful for hybridization. Northern analysis shows the expression of this sequence in 
reproductive and cardiovascular cDNA libraries. Approximately 67% of these libraries are 
associated with neoplastic disorders and 19% with immune response. 

Nucleic acids encoding the SIGP-59 of the present invention were first identified in
Incyte Clone 2652271 from the thymus cDNA library (THYMNOT04) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:136, was
derived from Incyte Clones 2652271 (THYMNOT04), 2742813 (BRSTTUT14), 763431 
(BRAITUT02), 1272403 (TESTTUT02), 1240531 (LUNGNOT03), and 1318448 
(BLADNOT04). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:59. SIGP-59 is 256 amino acids in length and has three
potential N-glycosylation sites at N76, N106, and N212; three potential casein kinase II
phosphorylation sites at T46, S188, and T204; two potential protein kinase C phosphorylation
sites at S130 and S221; two potential ribonuclease T2 family histidine active sites from W62
to P69 and from F110 to C121; and a potential signal peptide sequence from M1 to A24.
SIGP-59 shares 24% identity with Solanum lycopersicum ribonuclease LE (GI 895855); 80%
identity between W62 and P75, one of the two ribonuclease T2 family histidine active sites;
and 92% identity between F110 and C121, the second of the two ribonuclease T2 family
histidine active sites. The fragment of SEQ ID NO:136 from about nucleotide 462 to about
nucleotide 494 is useful for hybridization. Northern analysis shows the expression of this
sequence in reproductive, hematopoietic, and gastrointestinal cDNA libraries. Approximately
53% of these libraries are associated with neoplastic disorders and 28% with immune
response.

Nucleic acids encoding the SIGP-60 of the present invention were first identified in
Incyte Clone 2746976 from the lung tumor cDNA library (LUNGTUT11) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:137, was
derived from Incyte Clones 2746976 (LUNGTUT11), 488049 (HNT2AGT01), 1907738
(CONNTUT01), 782645 (MYOMNOT01), and 823864 (PROSNOT06). 

In one embodiment, the invention encompasses a polypeptide comprising the amino 
acid sequence of SEQ ID NO:60. SIGP-60 is 160 amino acids in length and has one potential
cAMP- and cGMP-dependent protein kinase phosphorylation site at T31; four potential
casein kinase II phosphorylation sites at S23, S47, S96, and S152; four potential protein
kinase C phosphorylation sites at S23, T125, S126, and T149; and a clathrin adaptor complex
small chain signature from I56 to F66. SIGP-60 shares 84% identity with mouse
clathrin-associated protein 19 (GI 191983) and 91% identity with the clathrin adaptor
complex small chain signature between I56 and F66. In addition, all potential casein
kinase II and protein
kinase C phosphorylation sites are conserved between SIGP-60 and the mouse protein. The 
fragments of SEQ ID NO: 137 from about nucleotide 144 to about nucleotide 170 and from 
about nucleotide 495 to about nucleotide 521 are useful for hybridization. Northern analysis 
shows the expression of this sequence in hematopoietic, cardiovascular, and reproductive
cDNA libraries. Approximately 39% of these libraries are associated with neoplastic 
disorders and 39% with immune response. 

Nucleic acids encoding the SIGP-61 of the present invention were first identified in 
Incyte Clone 2753496 from the THP-1 promonocyte cDNA library (THP1AZS08) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:138, was derived from Incyte Clones 2753496 (THP1AZS08), 2642512 (LUNGTUT08),
1367244 (SCORNON02), 474458 (MMLR1DT01), 1349777 (LATRTUT02), 1380831 
(BRAITUT08), and 832934 (PROSTUT04). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:61. SIGP-61 is 341 amino acids in length and has one potential
N-glycosylation site at N66; four potential casein kinase II phosphorylation sites at T157,
T207, S296, and S335; two potential protein kinase C phosphorylation sites at S159 and
S296; and one potential tyrosine kinase phosphorylation site at Y184. SIGP-61 shares 17%
identity with Schizosaccharomyces pombe BEM46, a protein involved in cell polarity (GI
987286) and the potential phosphorylation sites at T157 and S296. The fragment of SEQ ID
NO:138 from about nucleotide 79 to about nucleotide 114 is useful for hybridization.
Northern analysis shows the expression of this sequence in reproductive, gastrointestinal, and 
neural cDNA libraries. Approximately 52% of these libraries are associated with neoplastic 
disorders and 25% with immune response. 

Nucleic acids encoding the SIGP-62 of the present invention were first identified in
Incyte Clone 2781553 from the ovarian tumor cDNA library (OVARTUT03) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:139, was derived from Incyte Clones 2781553 (OVARTUT03), 1413079 (BRAINOT12),
894971 (BRSTNOT05), 2696043 (UTRSNOT12), 1267806 (BRAINOT09), 1961608
(BRSTNOT04), 1755817 (LIVRTUT01), 1793882 (PROSTUT05), 1251515
(LUNGFET03), 1560984 (SPLNNOT04), and 1872574 (LEUKNOT02).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:62. SIGP-62 is 430 amino acids in length and has one potential
cAMP- and cGMP-dependent protein kinase phosphorylation site at S387; thirteen potential
casein kinase II phosphorylation sites at S182, S214, S235, T248, S258, T266, T275, T294,
S313, T356, S387, T404, and S413; six potential protein kinase C phosphorylation sites at
T71, S168, S235, S306, T356, and S374; and a mitochondrial energy transfer protein
signature from P114 to L122. Northern analysis shows the expression of this sequence in
reproductive, neural, and hematopoietic cDNA libraries. Approximately 47% of these
libraries are associated with neoplastic disorders and 19% with immune response.
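
The text does not say how the potential modification sites were predicted. As an illustration only, the sketch below scans a protein sequence with the widely used PROSITE-style consensus patterns for N-glycosylation (N-{P}-[ST]-{P}), casein kinase II ([ST]-x(2)-[DE]), and protein kinase C ([ST]-x-[RK]) sites. Using these patterns here is an assumption, and the function name `find_sites` and the toy sequence in the usage note are hypothetical.

```python
import re

# PROSITE-style consensus patterns, written as regex lookaheads so that
# overlapping candidate sites are all reported.
MOTIFS = {
    "N-glycosylation": r"(?=N[^P][ST][^P])",
    "casein kinase II": r"(?=[ST]..[DE])",
    "protein kinase C": r"(?=[ST].[RK])",
}


def find_sites(protein: str) -> dict:
    """Return {motif name: [labels such as 'S36']} using 1-based positions."""
    protein = protein.upper()
    sites = {}
    for name, pattern in MOTIFS.items():
        sites[name] = [
            f"{protein[m.start()]}{m.start() + 1}"
            for m in re.finditer(pattern, protein)
        ]
    return sites
```

For the toy sequence `AANGTAASAKAATLLEA`, `find_sites` reports N3 as a candidate N-glycosylation site, T13 as a candidate casein kinase II site, and S8 as a candidate protein kinase C site. Such matches are only potential sites, consistent with the hedged wording used throughout this section.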

Nucleic acids encoding the SIGP-63 of the present invention were first identified in
Incyte Clone 2821925 from the adrenal tumor cDNA library (ADRETUT06) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:140, was derived from Incyte Clones 2821925 (ADRETUT06), 933799 (CERVNOT01),
and 136467 (SYNORAB01).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:63. SIGP-63 is 143 amino acids in length and has one potential
cAMP- and cGMP-dependent protein kinase phosphorylation site at S109; three potential
casein kinase II phosphorylation sites at S36, S80, and T84; five potential protein kinase C
phosphorylation sites at T31, T55, T70, S109, and T122; and a potential signal peptide
sequence from M1 to A21. Northern analysis shows the expression of this sequence in
reproductive, musculoskeletal, and cardiovascular cDNA libraries. Approximately 50% of
these libraries are associated with neoplastic disorders and 27% with immune response.

Nucleic acids encoding the SIGP-64 of the present invention were first identified in
Incyte Clone 2879068 from the uterine tumor cDNA library (UTRSTUT05) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:141, was
derived from Incyte Clones 2879068 (UTRSTUT05), 2910155 (KIDNTUT15), 488673
(HNT2AGT01), 1285407 (COLNNOT16), 1415890 (BRAINOT12), 1352662
(LATRTUT02), 41046 (TBLYNOT01), and 2686554 (LUNGNOT23).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:64. SIGP-64 is 301 amino acids in length and has two potential
N-glycosylation sites at N20 and N251; five potential casein kinase II phosphorylation sites at
S8, S41, T125, T161, and T163; five potential protein kinase C phosphorylation sites at T40,
S41, T59, T66, and S181; one potential tyrosine kinase phosphorylation site at Y176; one
potential glycosaminoglycan attachment site at S253; and two putative RNP-1 RNA-binding
signatures from R70 to F77 and from R155 to Y162. SIGP-64 shares 59% identity with
human heterogeneous nuclear ribonucleoprotein D (GI 870749); 100% identity between R70
and F77, one of the two RNP-1 RNA-binding signatures; and 89% identity between R155 and
Y162, the second of the two RNP-1 RNA-binding signatures. In addition, eight potential
phosphorylation sites are conserved between SIGP-64 and the human ribonucleoprotein. The
fragments of SEQ ID NO:141 from about nucleotide 207 to about nucleotide 248 and from
about nucleotide 726 to about nucleotide 752 are useful for hybridization. Northern analysis
shows the expression of this sequence in reproductive, neural, hematopoietic, and
gastrointestinal cDNA libraries. Approximately 48% of these libraries are associated with
neoplastic disorders and 24% with immune response.

Nucleic acids encoding the SIGP-65 of the present invention were first identified in
Incyte Clone 2886757 from the small intestine cDNA library (SINJNOT02) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:142, was
derived from Incyte Clones 2886757 (SINJNOT02), 2230747 (PROSNOT16), and 899432
(BRSTTUT03).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:65. SIGP-65 is 233 amino acids in length and has two potential
N-glycosylation sites at N82 and N196; one potential casein kinase II phosphorylation site at
S170; and two potential protein kinase C phosphorylation sites at S102 and T134. SIGP-65
shares 22% identity with the S. cerevisiae protein encoded by YOL135c (GI 1420026), and the
potential casein kinase II phosphorylation site at S170 is conserved between the two proteins.
The fragment of SEQ ID NO:142 from about nucleotide 99 to about nucleotide 137 is useful
for hybridization. Northern analysis shows the expression of this sequence in reproductive,
cardiovascular, and gastrointestinal cDNA libraries. Approximately 59% of these libraries
are associated with neoplastic disorders.

Nucleic acids encoding the SIGP-66 of the present invention were first identified in
Incyte Clone 2964329 from the cervical spinal cord cDNA library (SCORNOT04) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:143, was derived from Incyte Clones 2964329 (SCORNOT04), 1274814
(TESTTUT02), 746049 (BRAITUT01), 1395667 (THYRNOT03), 1362944
(LUNGNOT12), and 2589 (HMC1NOT01).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:66. SIGP-66 is 354 amino acids in length and has one
potential cAMP- and cGMP-dependent protein kinase phosphorylation site at S346; two
potential casein kinase II phosphorylation sites at S164 and T180; six potential protein
kinase C phosphorylation sites at S43, S135, S150, S164, S172, and S201; and one
potential tyrosine kinase phosphorylation site at Y182. SIGP-66 shares 12% identity with
S. cerevisiae mitochondrial internal membrane carrier protein (GI 311667). In addition,
one potential protein kinase C site is conserved between these molecules. The fragment of
SEQ ID NO:143 from about nucleotide 416 to about nucleotide 442 is useful for
hybridization. Northern analysis shows the expression of this sequence in reproductive,
neural, hematopoietic/immune, gastrointestinal, and cardiovascular cDNA libraries.
Approximately 46% of these libraries are associated with neoplastic disorders and 26%
with immune response.

Nucleic acids encoding the SIGP-67 of the present invention were first identified in
Incyte Clone 2965248 from the cervical spinal cord cDNA library (SCORNOT04) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:144, was derived from Incyte Clones 2965248 (SCORNOT04), 485746
(HNT2RAT01), 865684 (BRAITUT03), 1459157 (COLNFET02), 1597772
(BRAINOT14), 531430 (BRAINOT03), 725362 (SYNOOAT01), 1620429 (BRAITUT13),
and 190305 (SYNORAB01).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:67. SIGP-67 is 235 amino acids in length and has seven
potential cAMP- and cGMP-dependent protein kinase phosphorylation sites at S50, T80,
T98, T126, S135, S136, and T194; three potential casein kinase II phosphorylation sites at
S60, T80, and S81; six potential protein kinase C phosphorylation sites at S114, T119,
T137, S142, S146, and S174; and a stathmin 1 family signature from P75 to E84. SIGP-67
shares 44% identity with human stathmin homolog SCG10/neuron-specific
growth-associated protein in Alzheimer's disease (GI 1478503), and 71% identity between M1 and
A107. In addition, one potential cAMP- and cGMP-dependent protein kinase
phosphorylation site, one potential casein kinase II phosphorylation site, the stathmin 1
family signature, and the hydrophobic transmembrane domains are conserved between
these molecules. TM1 extends from about L15 to about F25; and TM2, from about G196
to about P212. The fragments of SEQ ID NO:144 from about nucleotide 158 to about
nucleotide 196 and from about nucleotide 614 to about nucleotide 643 are useful for
hybridization. Northern analysis shows the expression of this sequence in neural,
reproductive, gastrointestinal, and hematopoietic/immune cDNA libraries. Approximately
50% of these libraries are associated with neoplastic disorders and 19% with immune
response.

Nucleic acids encoding the SIGP-68 of the present invention were first identified in
Incyte Clone 3000534 from the Th2 T lymphocyte cDNA library (TLYMNOT06) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:145, was derived from Incyte Clones 3000534 (TLYMNOT06), 1830964
(THP1AZT01), 1329136 (PANCNOT07), and 2910083 (KIDNTUT15).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:68. SIGP-68 is 221 amino acids in length and has two
potential casein kinase II phosphorylation sites at T31 and T70; one potential
glycosaminoglycan attachment site at S62; three potential protein kinase C phosphorylation
sites at T111, T146, and T199; and an endoplasmic reticulum targeting sequence at
H218DEL. SIGP-68 shares 61% identity with the human stroma cell-derived secretory
factor-2 (GI 1741868). In addition, one potential protein kinase C phosphorylation site and
the hydrophobic transmembrane domains are conserved between these molecules. TM1
extends from about A10 to about G27; and TM2, from about T31 to about L45. The
cysteines at C38, C92, C100, and C149 are conserved between both molecules. The
fragments of SEQ ID NO:145 from about nucleotide 89 to about nucleotide 118 and from
about nucleotide 608 to about nucleotide 643 are useful for hybridization. Northern
analysis shows the expression of this sequence in hematopoietic/immune, reproductive,
cardiovascular, and gastrointestinal cDNA libraries. Approximately 41% of these libraries
are associated with neoplastic disorders and 31% with immune response.

Nucleic acids encoding the SIGP-69 of the present invention were first identified in
Incyte Clone 3046870 from the coronary artery cDNA library (HEAANOT01) using a
computer search for amino acid sequence alignments. A consensus sequence, SEQ ID
NO:146, was derived from Incyte Clones 3046870 (HEAANOT01), 2719210
(THYRNOT09), 581291 (SATPFI006), 1961256 (BRSTNOT04), 2226972
(SEMVNOT01), 2023351 (CONNNOT01), 1379008 (LUNGNOT10), and 1943136
(HIPONOT01).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:69. SIGP-69 is 483 amino acids in length and has one
potential N-glycosylation site at N178; ten potential casein kinase II phosphorylation sites
at S16, S49, T60, T67, T92, T121, T170, T187, T250, and S431; and nine potential
protein kinase C phosphorylation sites at S113, T170, T187, T194, S210, T265, S284,
T355, and S431. Northern analysis shows the expression of this sequence in reproductive,
gastrointestinal, cardiovascular, and neural cDNA libraries. Approximately 49% of these
libraries are associated with neoplastic disorders and 24% with immune response.

Nucleic acids encoding the SIGP-70 of the present invention were first identified in
Incyte Clone 3057669 from the pons cDNA library (PONSAZT01) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:147, was
derived from Incyte Clones 3057669 (PONSAZT01), 548211 (BEPINOT01), 3702516
(PENCNOT07), 3581270 (293TF3T01), 495191 (HNT2NOT01), 2784427 (BRSTNOT13),
1515961 (PANCTUT01), 3552333 (SYNONOT01), 2838668 (DRGLNOT01), 14600680
(COLNFET02), and 285677 (EOSIHET02).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:70. SIGP-70 is 371 amino acids in length and has three
potential N-glycosylation sites at N70, N125, and N362; eleven potential casein kinase II
phosphorylation sites at T22, S66, S72, S73, S102, T160, T201, T215, T278, T285, and
S316; seven potential protein kinase C phosphorylation sites at S72, T79, S99, T127,
S134, S257, and T299; and one protein kinase signature and profile from L188 to F200.
Northern analysis shows the expression of this sequence in gastrointestinal, reproductive,
and neural cDNA libraries. Approximately 54% of these libraries are associated with
neoplastic disorders and 14% with immune response.

Nucleic acids encoding the SIGP-71 of the present invention were first identified in
Incyte Clone 3088178 from the aorta cDNA library (HEAONOT03) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:148, was
derived from Incyte Clones 3088178 (HEAONOT03), 589421 (UTRSNOT01), 2059958
(OVARNOT03), 1550631 (PROSNOT06), and 1271480 (TESTTUT02).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:71. SIGP-71 is 402 amino acids in length and has two
potential N-glycosylation sites at N13 and N366; two potential cAMP- and cGMP-dependent
protein kinase phosphorylation sites at T50 and S51; five potential casein kinase
II phosphorylation sites at T50, S51, S52, S56, and S246; one potential glycosaminoglycan
attachment site at S247; eight potential protein kinase C phosphorylation sites at T45, T46,
S224, S240, S259, T279, S338, and S376; one potential tyrosine kinase phosphorylation
site at Y273; and one beta-transducin family Trp-Asp repeat signature from V243 to V257.
SIGP-71 shares 22% identity with S. cerevisiae protein encoded by HRE594 (GI 498997;
truncated sequence). In addition, one potential N-glycosylation site and two potential
casein kinase II phosphorylation sites are conserved between these molecules. The
fragment of SEQ ID NO:148 from about nucleotide 725 to about nucleotide 766 is useful
for hybridization. Northern analysis shows the expression of this sequence in reproductive,
neural, cardiovascular, and hematopoietic/immune cDNA libraries. Approximately 51% of
these libraries are associated with neoplastic disorders and 23% with immune response.

Nucleic acids encoding the SIGP-72 of the present invention were first identified in
Incyte Clone 3094321 from the breast cDNA library (BRSTNOT19) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:149, was
derived from Incyte Clones 3094321 (BRSTNOT19), 2517422H1 (BRAITUT21), 2101110
(BRAITUT02), 1303603 (PLACNOT02), 2675275 (KIDNNOT19), 1988065
(LUNGAST01), 34101 (THP1NOB01), 1815156 (PROSNOT20), 602724 (BRSTTUT01),
and 1485067 (CORPNOT02).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:72. SIGP-72 is 640 amino acids in length and has four
potential N-glycosylation sites at N295, N513, N568, and N619; two potential cAMP- and
cGMP-dependent protein kinase phosphorylation sites at S239 and S507; sixteen potential
casein kinase II phosphorylation sites at S42, T178, T220, S229, S239, T247, S289, S350,
S372, S446, T463, S492, T580, S592, S604, and S625; nine potential protein kinase C
phosphorylation sites at T150, T166, T174, S239, T328, S407, T451, S609, and S621; one
potential tyrosine kinase phosphorylation site at Y265; and one cytochrome c family
heme-binding site signature at C158YECHP. SIGP-72 shares 33% identity with an essential
yeast ubiquitin-activating enzyme homolog (GI 793879). In addition, one potential
N-glycosylation site, one potential casein kinase II phosphorylation site, and six potential
protein kinase C phosphorylation sites are conserved between these molecules. The
fragments of SEQ ID NO:149 from about nucleotide 382 to about nucleotide 423 and from
about nucleotide 1087 to about nucleotide 1113 are useful for hybridization. Northern
analysis shows the expression of this sequence in reproductive, hematopoietic/immune,
cardiovascular, and gastrointestinal cDNA libraries. Approximately 48% of these libraries
are associated with neoplastic disorders and 24% with immune response.

Nucleic acids encoding the SIGP-73 of the present invention were first identified in
Incyte Clone 3115936 from the lung cDNA library (LUNGTUT13) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:150, was
derived from Incyte Clones 3115936 (LUNGTUT13), 2359411 (LUNGFET05), 2189762
(PROSNOT26), 1449756 (PLACNOT02), 541212 (LNODNOT02), 079364
(SYNORAB01), 864877 (BRAITUT03), 2697958 (UTRSNOT12), 1818830
(PROSNOT20), 1966765 (BRSTNOT04), 998279 (KIDNTUT01), 1961616
(BRSTNOT04), and 1431515 (BEPINON01).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:73. SIGP-73 is 237 amino acids in length and has five
potential casein kinase II phosphorylation sites at S43, S47, S72, S131, and T177; and
three potential protein kinase C phosphorylation sites at S39, S125, and T202. SIGP-73
shares 44% identity with the yeast Rer1p protein, which ensures correct localization of
Sec12p integral membrane protein of the endoplasmic reticulum (GI 517174). In addition,
the hydrophobic transmembrane domains are conserved between these molecules. TM1
extends from about A82 to about P126; and TM2, from about A166 to about M203. The
fragment of SEQ ID NO:150 from about nucleotide 585 to about nucleotide 623 is useful
for hybridization. Northern analysis shows the expression of this sequence in reproductive,
neural, cardiovascular, gastrointestinal, and hematopoietic/immune cDNA libraries.
Approximately 48% of these libraries are associated with neoplastic disorders and 24%
with immune response.

Nucleic acids encoding the SIGP-74 of the present invention were first identified in 
Incyte Clone 3116522 from the lung cDNA library (LUNGTUT13) using a computer 
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:151, was
derived from Incyte Clones 3116522 (LUNGTUT13), 2523149 (BRAITUT21), 1513583 
(PANCTUT01), 834017 (PROSNOT07), 1631796 (COLNNOT19), 1502736 
(BRAITUT07), and 78850 (SYNORAB01). 

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:74. SIGP-74 is 432 amino acids in length and has three
potential casein kinase II phosphorylation sites at S144, S257, and S317; three potential
protein kinase C phosphorylation sites at T68, S231, and T372; and one potential tyrosine
kinase phosphorylation site at Y240. SIGP-74 shares 28% identity with the human
UDP-galactose transporter isoform (GI 1669560). In addition, one potential protein kinase C
phosphorylation site and the hydrophobic transmembrane domains are conserved between
these molecules. TM4 extends from about Q108 to about G127; TM5, from about S152 to
about L173; TM6, from about K205 to about K228; TM7, from about T242 to about S257;
TM8, from about T268 to about S283; TM9, from about A294 to about T328; and TM10,
from about A338 to about V409. The fragment of SEQ ID NO:151 from about nucleotide
710 to about nucleotide 736 is useful for hybridization. Northern analysis shows the
expression of this sequence in reproductive, gastrointestinal, cardiovascular,
hematopoietic/immune, and urologic cDNA libraries. Approximately 54% of these
libraries are associated with neoplastic disorders and 25% with immune response.

Nucleic acids encoding the SIGP-75 of the present invention were first identified in
Incyte Clone 3117184 from the lung cDNA library (LUNGTUT13) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:152, was
derived from Incyte Clones 3117184 (LUNGTUT13), 2494724 (ADRETUT05), and
1922002 (BRSTTUT01).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:75. SIGP-75 is 252 amino acids in length and has one
potential N-glycosylation site at N93; one potential cAMP- and cGMP-dependent protein
kinase phosphorylation site at S179; one potential casein kinase II phosphorylation site at
T189; and five potential protein kinase C phosphorylation sites at S95, S115, S123, T140,
and T200. SIGP-75 shares 39% identity with C. elegans protein encoded by W04D2.6
(GI 1418628). In addition, one potential N-glycosylation site and three potential protein
kinase C phosphorylation sites are conserved between the molecules. The fragment of SEQ
ID NO:152 from about nucleotide 567 to about nucleotide 593 is useful for hybridization.
Northern analysis shows the expression of this sequence in cardiovascular, gastrointestinal,
hematopoietic/immune, and reproductive cDNA libraries. Approximately 50% of these
libraries are associated with neoplastic disorders and 20% with immune response.

Nucleic acids encoding the SIGP-76 of the present invention were first identified in
Incyte Clone 3125156 from the lymph node cDNA library (LNODNOT05) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:153, was
derived from Incyte Clones 3125156 (LNODNOT05), 1417459 (BRAINOT12), 1567861
(UTRSNOT05), 154233 (THP1PLB02), 872652 (LUNGAST01), 2525803 (BRAITUT21),
and 1209172 (BRSTNOT02).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:76. SIGP-76 is 523 amino acids in length and has one potential
N-glycosylation site at N186; nine potential casein kinase II phosphorylation sites at S63,
T85, S179, S188, T210, S231, T269, T295, and S474; one potential glycosaminoglycan
attachment site at S335; ten potential protein kinase C phosphorylation sites at T9, S159,
S172, S179, T246, S263, S283, S416, S447, and S498; two potential tyrosine kinase
phosphorylation sites at Y106 and Y170; and one tyrosine specific protein phosphatase active
site at V331. SIGP-76 shares 21% identity with human T-cell protein tyrosine phosphatase
(GI 804750), the N186 glycosylation site, the phosphorylation sites at S179, S188, T210,
T246, S263, T295, S416, and Y170; and 50% identity between P324 and F344, the region of
the tyrosine specific protein phosphatase active site. The fragments of SEQ ID NO:153 from
about nucleotide 64 to about nucleotide 183 and from about nucleotide 1087 to about
nucleotide 1119 are useful for hybridization. Northern analysis shows the expression of this
sequence in neural, reproductive, and gastrointestinal cDNA libraries. Approximately 55% of
these libraries are associated with neoplastic disorders and 22% with immune response.

Nucleic acids encoding the SIGP-77 of the present invention were first identified in
Incyte Clone 3129120 from the lung tumor cDNA library (LUNGTUT12) using a computer
search for amino acid sequence alignments. A consensus sequence, SEQ ID NO:154, was
derived from Incyte Clones 3129120 (LUNGTUT12), 3744590 (THYMNOT08), 1512939
(PANCTUT01), 3220539 (COLNNON03), 1435889 (PANCNOT08), 1452745
(PENITUT01), 874548 (LUNGAST01), 1524326 (UCMCL5T01), and 811239
(LUNGNOT04).

In one embodiment, the invention encompasses a polypeptide comprising the amino
acid sequence of SEQ ID NO:77. SIGP-77 is 621 amino acids in length and has two potential
N-glycosylation sites at N203 and N517; one potential protein kinase A or G phosphorylation
site at S84; five potential casein kinase II phosphorylation sites at T45, T185, T233, T278,
and S573; seven potential protein kinase C phosphorylation sites at T45, T95, S109, S299,
T318, S324, and T482; and one potential leucine zipper motif from L332 to L353. SIGP-77
shares 27% identity and the phosphorylation site at T318 with S. cerevisiae membrane
protein important for endocytosis (GI 1256890). The fragments of SEQ ID NO:154 from
about nucleotide 64 to about nucleotide 183 and from about nucleotide 1087 to about
nucleotide 1119 are useful for hybridization. Northern analysis shows the expression of this
sequence in reproductive, neural, gastrointestinal, and cardiovascular cDNA libraries.
Approximately 53% of these libraries are associated with neoplastic disorders and 17% with
immune response.

The invention also encompasses SIGP variants. A preferred SIGP variant is one which
has at least about 80%, more preferably at least about 90%, and most preferably at least about
95% amino acid sequence identity to the SIGP amino acid sequence, and which contains at
least one functional or structural characteristic of SIGP.
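
The identity thresholds above can be illustrated with a short calculation. The sketch below is not the method used in the specification, which does not define one at this point. It assumes two sequences that have already been aligned, with `-` marking gaps, and it counts identity over the aligned, non-gap columns only. Alignment tools differ in how they choose the denominator, so this choice is an assumption. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned (non-gap) columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def variant_tier(identity: float) -> str:
    """Map an identity value onto the 80% / 90% / 95% tiers named in the text."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return "below the stated thresholds"
```

Because the text says "at least about", the hard cutoffs in `variant_tier` are a simplification made for illustration.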

The invention also encompasses polynucleotides which encode SIGP. Accordingly, any
nucleic acid sequence which encodes the amino acid sequence of SIGP can be used to
produce recombinant molecules which express SIGP. In a particular embodiment, the
invention encompasses a polynucleotide consisting of a nucleic acid sequence selected from
the group consisting of
SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82,
SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87,
SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92,
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97,
SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102,
SEQ ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107,
SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112,
SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID NO:117,
SEQ ID NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID NO:122,
SEQ ID NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID NO:127,
SEQ ID NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID NO:132,
SEQ ID NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID NO:137,
SEQ ID NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID NO:142,
SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147,
SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID NO:152,
SEQ ID NO:153, and SEQ ID NO:154.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding SIGP, some bearing minimal
homology to the polynucleotide sequences of any known and naturally occurring gene, may
be produced. Thus, the invention contemplates each and every possible variation of
polynucleotide sequence that could be made by selecting combinations based on possible
codon choices. These combinations are made in accordance with the standard triplet genetic
code as applied to the polynucleotide sequence of naturally occurring SIGP, and all such
variations are to be considered as being specifically disclosed.
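
To give a sense of the scale implied by codon degeneracy, the following sketch counts how many distinct coding sequences can encode a given peptide under the standard genetic code. The per-residue codon counts are those of the standard code (61 sense codons in total). The function name `count_encodings` and the example peptides are illustrative assumptions and are not drawn from the Sequence Listing.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding the peptide
    (stop codon not included)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())
```

Even a four-residue peptide such as `MKTL` has 1 x 2 x 4 x 6 = 48 possible encodings, and the count grows multiplicatively with each added residue.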

Although nucleotide sequences which encode SIGP and its variants are preferably
capable of hybridizing to the nucleotide sequence of the naturally occurring SIGP under
appropriately selected conditions of stringency, it may be advantageous to produce nucleotide
sequences encoding SIGP or its derivatives possessing a substantially different codon usage.
Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which
particular codons are utilized by the host. Other reasons for substantially altering the
nucleotide sequence encoding SIGP and its derivatives without altering the encoded amino
acid sequences include the production of RNA transcripts having more desirable properties,
such as a greater half-life, than transcripts produced from the naturally occurring sequence.
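
Selecting codons according to host usage can be sketched as a simple back-translation in which each residue is encoded by the codon the host uses most often. The sketch below is a minimal illustration under stated assumptions. The function names are hypothetical, and the frequencies in the usage note are made-up values for demonstration, not measured codon usage for any organism.

```python
def pick_preferred_codons(usage: dict) -> dict:
    """For each amino acid, choose the codon with the highest frequency.

    `usage` maps one-letter residue codes to {codon: frequency} dicts.
    """
    return {aa: max(codons, key=codons.get) for aa, codons in usage.items()}


def back_translate(peptide: str, usage: dict) -> str:
    """Encode a peptide using the host's most frequent codon per residue."""
    preferred = pick_preferred_codons(usage)
    try:
        return "".join(preferred[aa] for aa in peptide.upper())
    except KeyError as exc:
        raise ValueError(
            f"no codon usage data for residue {exc.args[0]!r}"
        ) from None
```

For example, with the hypothetical table `{"M": {"ATG": 1.0}, "K": {"AAA": 0.7, "AAG": 0.3}, "L": {"CTG": 0.5, "CTC": 0.4, "TTA": 0.1}}`, `back_translate("MKL", table)` returns `ATGAAACTG`. The encoded amino acid sequence is unchanged; only the codon choice differs.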

The invention also encompasses production of DNA sequences which encode SIGP and
SIGP derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic sequence may be inserted into any of the many available expression vectors and cell
systems using reagents that are well known in the art. Moreover, synthetic chemistry may be
used to introduce mutations into a sequence encoding SIGP or any fragment thereof.

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in
SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82,
SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87,
SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92,
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97,
SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102,
SEQ ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107,
SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112,
SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID NO:117,
SEQ ID NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID NO:122,
SEQ ID NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID NO:127,
SEQ ID NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID NO:132,
SEQ ID NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID NO:137,
SEQ ID NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID NO:142,
SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147,
SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID NO:152,
SEQ ID NO:153, and SEQ ID NO:154,
under various conditions of stringency. (See, e.g., Wahl, G.M. and S.L. Berger (1987)
Methods Enzymol. 152:399-407; and Kimmel, A.R. (1987) Methods Enzymol. 152:507-511.)

Methods for DNA sequencing are well known and generally available in the art and may
be used to practice any of the embodiments of the invention. The methods may employ such
enzymes as the Klenow fragment of DNA polymerase I, Sequenase® (US Biochemical Corp.,
Cleveland, OH), Taq polymerase (Perkin Elmer), thermostable T7 polymerase (Amersham,
Chicago, IL), or combinations of polymerases and proofreading exonucleases such as those
found in the ELONGASE Amplification System (GIBCO/BRL, Gaithersburg, MD).
Preferably, the process is automated with machines such as the Hamilton Micro Lab 2200
(Hamilton, Reno, NV), Peltier Thermal Cycler (PTC200; MJ Research, Watertown, MA) and
the ABI Catalyst and 373 and 377 DNA Sequencers (Perkin Elmer).

The nucleic acid sequences encoding SIGP may be extended utilizing a partial nucleotide
sequence and employing various methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be
employed, restriction-site PCR, uses universal primers to retrieve unknown sequence adjacent
to a known locus. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.) In
particular, genomic DNA is first amplified in the presence of a primer complementary to a
linker sequence within the vector and a primer specific to the region predicted to encode the
gene. The amplified sequences are then subjected to a second round of PCR with the same
linker primer and another specific primer internal to the first one. Products of each round of
PCR are transcribed with an appropriate RNA polymerase and sequenced using reverse
transcriptase.

Inverse PCR may also be used to amplify or extend sequences using divergent primers 
based on a known region. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186.) 
The primers may be designed using commercially available software such as OLIGO 4.06 
Primer Analysis software (National Biosciences Inc., Plymouth, MN) or another appropriate 
program to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or
more, and to anneal to the target sequence at temperatures of about 68°C to 72°C. The
method uses several restriction enzymes to generate a suitable fragment in the known region 
of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR 
template. 
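
By way of illustration only, the length and GC-content criteria stated above may be checked computationally. The following Python sketch is not part of the OLIGO software; the function names `gc_fraction` and `meets_primer_criteria` and the default thresholds are assumptions taken from the ranges given in the text. Annealing temperature is deliberately not evaluated, since it depends on the thermodynamic model and buffer conditions chosen.

```python
def gc_fraction(primer):
    """Return the fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check a primer against the length and GC-content ranges in the text.

    Hypothetical helper: thresholds default to about 22 to 30 nucleotides
    and a GC content of about 50% or more. Annealing temperature is not checked.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Example with an arbitrary 24-nucleotide sequence (not a SIGP primer).
candidate = "GCGCATATGCGCATATGCGCATAT"
print(len(candidate), gc_fraction(candidate), meets_primer_criteria(candidate))
```

A candidate primer that fails either test would be redesigned before use as a divergent primer for inverse PCR.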

Another method which may be used is capture PCR, which involves PCR amplification
of DNA fragments adjacent to a known sequence in human and yeast artificial chromosome
DNA. (See, e.g., Lagerstrom, M. et al. (1991) PCR Methods Applic. 1:111-119.) In this
method, multiple restriction enzyme digestions and ligations may be used to place an
engineered double-stranded sequence into an unknown fragment of the DNA molecule before
performing PCR. Other methods which may be used to retrieve unknown sequences are
known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060.)
Additionally, one may use PCR, nested primers, and PromoterFinder™ libraries to walk
genomic DNA (Clontech, Palo Alto, CA). This process avoids the need to screen libraries
and is useful in finding intron/exon junctions.
When screening for full-length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. Also, random-primed libraries are preferable in that
they will include more sequences which contain the 5' regions of genes. Use of a randomly
primed library may be especially preferable for situations in which an oligo d(T) library does
not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into
5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to 
analyze the size or confirm the nucleotide sequence of sequencing or PCR products. In 
particular, capillary sequencing may employ flowable polymers for electrophoretic 
separation, four different fluorescent dyes (one for each nucleotide) which are laser activated, 
and a charge coupled device camera for detection of the emitted wavelengths. Output/light
intensity may be converted to electrical signal using appropriate software (e.g., Genotyper™
and Sequence Navigator™, Perkin Elmer), and the entire process from loading of samples to
computer analysis and electronic data display may be computer controlled. Capillary
electrophoresis is especially preferable for the sequencing of small pieces of DNA which
might be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide sequences or fragments thereof
which encode SIGP may be used in recombinant DNA molecules to direct expression of 
SIGP, or fragments or functional equivalents thereof, in appropriate host cells. Due to the 
inherent degeneracy of the genetic code, other DNA sequences which encode substantially 
the same or a functionally equivalent amino acid sequence may be produced, and these
sequences may be used to clone and express SIGP. 

As will be understood by those of skill in the art, it may be advantageous to produce 
SIGP-encoding nucleotide sequences possessing non-naturally occurring codons. For 
example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to
increase the rate of protein expression or to produce an RNA transcript having desirable
properties, such as a half-life which is longer than that of a transcript generated from the 
naturally occurring sequence. 
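
The degeneracy of the genetic code and the substitution of host-preferred codons may be illustrated with a short Python sketch. The sketch assumes the standard genetic code; the function names `translate` and `recode` are hypothetical, and the `preferred` mapping passed in the example is a placeholder rather than measured codon-usage data for any host. The input sequence is likewise arbitrary and is not a SIGP-encoding sequence.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(dna):
    """Split a coding-strand DNA string into complete codons."""
    seq = dna.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(dna):
    """Translate DNA into one-letter amino acids ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def recode(dna, preferred):
    """Swap each codon for the preferred synonym of its amino acid, if given.

    `preferred` maps an amino acid letter to a codon (illustrative input only).
    """
    out = []
    for codon in codons(dna):
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)
        if CODON_TABLE[choice] != aa:
            raise ValueError(f"{choice} does not encode {aa}")
        out.append(choice)
    return "".join(out)


original = "ATGGCTTAA"
altered = recode(original, {"A": "GCG"})  # placeholder preference
print(altered, translate(original) == translate(altered))
```

The altered sequence differs from the original at the nucleotide level while encoding the same amino acid sequence, which is the property relied upon in the paragraphs above.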

The nucleotide sequences of the present invention can be engineered using methods 
generally known in the art in order to alter SIGP-encoding sequences for a variety of reasons
including, but not limited to, alterations which modify the cloning, processing, and/or
expression of the gene product. DNA shuffling by random fragmentation and PCR
reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the
nucleotide sequences. For example, site-directed mutagenesis may be used to insert new
restriction sites, alter glycosylation patterns, change codon preference, produce splice
variants, introduce mutations, and so forth.

In another embodiment of the invention, natural, modified, or recombinant nucleic acid 
sequences encoding SIGP may be ligated to a heterologous sequence to encode a fusion 
protein. For example, to screen peptide libraries for inhibitors of SIGP activity, it may be 
useful to encode a chimeric SIGP protein that can be recognized by a commercially available
antibody. A fusion protein may also be engineered to contain a cleavage site located between
the SIGP encoding sequence and the heterologous protein sequence, so that SIGP may be 
cleaved and purified away from the heterologous moiety. 

In another embodiment, sequences encoding SIGP may be synthesized, in whole or in 
part, using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980)
Nucl. Acids Res. Symp. Ser. 215-223, and Horn, T. et al. (1980) Nucl. Acids Res. Symp. Ser.
225-232.) Alternatively, the protein itself may be produced using chemical methods to
synthesize the amino acid sequence of SIGP, or a fragment thereof. For example, peptide
synthesis can be performed using various solid-phase techniques. (See, e.g., Roberge, J.Y. et
al. (1995) Science 269:202-204.) Automated synthesis may be achieved using the ABI 431A
Peptide Synthesizer (Perkin Elmer).

The newly synthesized peptide may be substantially purified by preparative high 
performance liquid chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990)
Methods Enzymol. 182:392-421.) The composition of the synthetic peptides may be
confirmed by amino acid analysis or by sequencing. (See, e.g., Creighton, T. (1983) Proteins:
Structures and Molecular Properties, WH Freeman and Co., New York, NY.) Additionally,
the amino acid sequence of SIGP, or any part thereof, may be altered during direct synthesis 
and/or combined with sequences from other proteins, or any part thereof, to produce a variant 
polypeptide. 

In order to express a biologically active SIGP, the nucleotide sequences encoding SIGP 
or derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which
contains the necessary elements for the transcription and translation of the inserted coding 
sequence. 

Methods which are well known to those skilled in the art may be used to construct 
expression vectors containing sequences encoding SIGP and appropriate transcriptional and
translational control elements. These methods include in vitro recombinant DNA techniques,
synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989)
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Plainview, NY, ch. 4,
8, and 16-17; and Ausubel, F.M. et al. (1995, and periodic supplements) Current Protocols in
Molecular Biology, John Wiley & Sons, New York, NY, ch. 9, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express
sequences encoding SIGP. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression
vectors; yeast transformed with yeast expression vectors; insect cell systems infected with
virus expression vectors (e.g., baculovirus); plant cell systems transformed with virus
expression vectors (e.g., cauliflower mosaic virus (CaMV) or tobacco mosaic virus (TMV))
or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.
The invention is not limited by the host cell employed. 

The "control elements" or "regulatory sequences" are those non-translated regions, e.g., 
enhancers, promoters, and 5' and 3' untranslated regions, of the vector and polynucleotide
sequences encoding SIGP which interact with host cellular proteins to carry out transcription 
and translation. Such elements may vary in their strength and specificity. Depending on the 
vector system and host utilized, any number of suitable transcription and translation elements, 
including constitutive and inducible promoters, may be used. For example, when cloning in 
bacterial systems, inducible promoters, e.g., hybrid lacZ promoter of the Bluescript® 
phagemid (Stratagene, La Jolla, CA) or pSport1™ plasmid (GIBCO/BRL), may be used. The
baculovirus polyhedrin promoter may be used in insect cells. Promoters or enhancers derived 
from the genomes of plant cells (e.g., heat shock, RUBISCO, and storage protein genes) or 
from plant viruses (e.g., viral promoters or leader sequences) may be cloned into the vector. 
In mammalian cell systems, promoters from mammalian genes or from mammalian viruses 
are preferable. If it is necessary to generate a cell line that contains multiple copies of the 
sequence encoding SIGP, vectors based on SV40 or EBV may be used with an appropriate 
selectable marker. 

In bacterial systems, a number of expression vectors may be selected depending upon the 
use intended for SIGP. For example, when large quantities of SIGP are needed for the 
induction of antibodies, vectors which direct high level expression of fusion proteins that are 
readily purified may be used. Such vectors include, but are not limited to, multifunctional 
E. coli cloning and expression vectors such as Bluescript® (Stratagene), in which the 
sequence encoding SIGP may be ligated into the vector in frame with sequences for the 
amino-terminal Met and the subsequent 7 residues of β-galactosidase so that a hybrid protein
is produced, and pIN vectors. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol.
Chem. 264:5503-5509.) pGEX vectors (Pharmacia Biotech, Uppsala, Sweden) may also be
used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST).
In general, such fusion proteins are soluble and can easily be purified from lysed cells by
adsorption to glutathione-agarose beads followed by elution in the presence of free
glutathione. Proteins made in such systems may be designed to include heparin, thrombin, or
factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released
from the GST moiety at will.

In the yeast Saccharomyces cerevisiae, a number of vectors containing constitutive or
inducible promoters, such as alpha factor, alcohol oxidase, and PGH, may be used. (See,
e.g., Ausubel, supra; and Grant et al. (1987) Methods Enzymol. 153:516-544.)

In cases where plant expression vectors are used, the expression of sequences encoding 
SIGP may be driven by any of a number of promoters. For example, viral promoters such as 
the 35S and 19S promoters of CaMV may be used alone or in combination with the omega 
leader sequence from TMV. (Takamatsu, N. (1987) EMBO J. 6:307-311.) Alternatively,
plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used.
(See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science
224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These
constructs can be introduced into plant cells by direct DNA transformation or
pathogen-mediated transfection. Such techniques are described in a number of generally
available reviews. (See, e.g., Hobbs, S. or Murry, L.E. in McGraw Hill Yearbook of Science
and Technology (1992) McGraw Hill, New York, NY; pp. 191-196.)

An insect system may also be used to express SIGP. For example, in one such system, 
Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express 
foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. The sequences
encoding SIGP may be cloned into a non-essential region of the virus, such as the polyhedrin
gene, and placed under control of the polyhedrin promoter. Successful insertion of sequences
encoding SIGP will render the polyhedrin gene inactive and produce recombinant virus
lacking coat protein. The recombinant viruses may then be used to infect, for example, S.
frugiperda cells or Trichoplusia larvae in which SIGP may be expressed. (See, e.g.,
Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. 91:3224-3227.)

In mammalian host cells, a number of viral-based expression systems may be utilized. In 
cases where an adenovirus is used as an expression vector, sequences encoding SIGP may be 
ligated into an adenovirus transcription/translation complex consisting of the late promoter 
and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral
genome may be used to obtain a viable virus which is capable of expressing SIGP in infected
host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. 81:3655-3659.) In
addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be
used to increase expression in mammalian host cells. 

Human artificial chromosomes (HACs) may also be employed to deliver larger 
fragments of DNA than can be contained and expressed in a plasmid. HACs of about 6 kb to 
10 Mb are constructed and delivered via conventional delivery methods (liposomes, 
polycationic amino polymers, or vesicles) for therapeutic purposes. 

Specific initiation signals may also be used to achieve more efficient translation of 
sequences encoding SIGP. Such signals include the ATG initiation codon and adjacent 
sequences. In cases where sequences encoding SIGP and its initiation codon and upstream 
sequences are inserted into the appropriate expression vector, no additional transcriptional or 
translational control signals may be needed. However, in cases where only coding sequence, 
or a fragment thereof, is inserted, exogenous translational control signals including the ATG 
initiation codon should be provided. Furthermore, the initiation codon should be in the 
correct reading frame to ensure translation of the entire insert. Exogenous translational 
elements and initiation codons may be of various origins, both natural and synthetic. The 
efficiency of expression may be enhanced by the inclusion of enhancers appropriate for the 
particular cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ. 
20:125-162.) 
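
The requirement that the initiation codon be in the correct reading frame with the insert may be sketched as a simple check. The Python function below is a simplified illustration under stated assumptions: the construct is given as a coding-strand DNA string, translation is assumed to begin at the first ATG in that string, and the insert coordinates (0-based, end-exclusive, excluding any terminal stop codon) are supplied by the user. The name `insert_is_in_frame` is hypothetical, and the example sequences are arbitrary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_is_in_frame(construct, insert_start, insert_end):
    """Return True if the first ATG reads through the whole insert in frame.

    Simplifying assumption: translation initiates at the first ATG found.
    """
    seq = construct.upper()
    atg = seq.find("ATG")
    if atg == -1 or atg > insert_start:
        return False  # no initiation codon at or upstream of the insert
    if (insert_start - atg) % 3 != 0:
        return False  # insert is out of phase with the initiation codon
    # Walk codon by codon; a stop codon before the insert end truncates translation.
    for i in range(atg, insert_end - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


# Vector-supplied ATG followed by an arbitrary 9-nucleotide insert (positions 5 to 14).
print(insert_is_in_frame("GGATGAAACCCGGGTAA", 5, 14))
# One extra base between the ATG and the insert shifts the frame.
print(insert_is_in_frame("GGATGCAAACCCGGGTAA", 6, 15))
```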

In addition, a host cell strain may be chosen for its ability to modulate expression of the 
inserted sequences or to process the expressed protein in the desired fashion. Such 
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, 
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which 
cleaves a "prepro" form of the protein may also be used to facilitate correct insertion, folding, 
and/or function. Different host cells which have specific cellular machinery and 
characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, 
HEK293, and WI38), are available from the American Type Culture Collection (ATCC, 
Bethesda, MD) and may be chosen to ensure the correct modification and processing of the 
foreign protein. 

For long term, high yield production of recombinant proteins, stable expression is
preferred. For example, cell lines capable of stably expressing SIGP can be transformed
using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days
in enriched media before being switched to selective media. The purpose of the selectable
marker is to confer resistance to selection, and its presence allows growth and recovery of 
cells which successfully express the introduced sequences. Resistant clones of stably 
transformed cells may be proliferated using tissue culture techniques appropriate to the cell 
type. 

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase genes and adenine
phosphoribosyltransferase genes, which can be employed in tk⁻ or apr⁻ cells, respectively.
(See, e.g., Wigler, M. et al. (1977) Cell 11:223-232; and Lowy, I. et al. (1980) Cell
22:817-823.) Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for
selection. For example, dhfr confers resistance to methotrexate; npt confers resistance to the
aminoglycosides neomycin and G-418; and als or pat confer resistance to chlorsulfuron and
phosphinothricin, respectively. (See, e.g., Wigler, M. et al. (1980) Proc.
Natl. Acad. Sci. 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14; and
Murry, supra.) Additional selectable genes have been described, e.g., trpB, which allows
cells to utilize indole in place of tryptophan, or hisD, which allows cells to utilize histidinol in
place of histidine. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci.
85:8047-8051.) Recently, the use of visible markers has gained popularity with such markers
as anthocyanins, β-glucuronidase and its substrate GUS, luciferase and its substrate luciferin.
Green fluorescent proteins (GFP) (Clontech, Palo Alto, CA) are also used. (See, e.g., Chalfie,
M. et al. (1994) Science 263:802-805.) These markers can be used not only to identify
transformants, but also to quantify the amount of transient or stable protein expression
attributable to a specific vector system. (See, e.g., Rhodes, C.A. et al. (1995) Methods Mol.
Biol. 55:121-131.)

Although the presence/absence of marker gene expression suggests that the gene of 
interest is also present, the presence and expression of the gene may need to be confirmed.
For example, if the sequence encoding SIGP is inserted within a marker gene sequence,
transformed cells containing sequences encoding SIGP can be identified by the absence of
marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence
encoding SIGP under the control of a single promoter. Expression of the marker gene in
response to induction or selection usually indicates expression of the tandem gene as well.

Alternatively, host cells which contain the nucleic acid sequence encoding SIGP and
express SIGP may be identified by a variety of procedures known to those of skill in the art.
These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations
and protein bioassay or immunoassay techniques which include membrane, solution, or chip
based technologies for the detection and/or quantification of nucleic acid or protein
sequences.

The presence of polynucleotide sequences encoding SIGP can be detected by DNA-DNA 
or DNA-RNA hybridization or amplification using probes or fragments of
polynucleotides encoding SIGP. Nucleic acid amplification based assays involve the use of
oligonucleotides or oligomers based on the sequences encoding SIGP to detect transformants
containing DNA or RNA encoding SIGP.

A variety of protocols for detecting and measuring the expression of SIGP, using either 
polyclonal or monoclonal antibodies specific for the protein, are known in the art. Examples 
of such techniques include enzyme-linked immunosorbent assays (ELISAs),
radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS). A two-site,
monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two
non-interfering epitopes on SIGP is preferred, but a competitive binding assay may be
employed. These and other assays are well described in the art. (See, e.g., Hampton, R. et al.
(1990) Serological Methods, a Laboratory Manual, APS Press, St Paul, MN, Section IV; and
Maddox, D.E. et al. (1983) J. Exp. Med. 158:1211-1216.)

A wide variety of labels and conjugation techniques are known by those skilled in the art 
and may be used in various nucleic acid and amino acid assays. Means for producing labeled 
hybridization or PCR probes for detecting sequences related to polynucleotides encoding 
SIGP include oligolabeling, nick translation, end-labeling, or PCR amplification using a
labeled nucleotide. Alternatively, the sequences encoding SIGP, or any fragments thereof,
may be cloned into a vector for the production of an mRNA probe. Such vectors are known
in the art, are commercially available, and may be used to synthesize RNA probes in vitro by
addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides.
These procedures may be conducted using a variety of commercially available kits, such as
those provided by Pharmacia & Upjohn (Kalamazoo, MI), Promega (Madison, WI), and U.S.
Biochemical Corp. (Cleveland, OH). Suitable reporter molecules or labels which may be 
used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or 
chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the 
like. 

Host cells transformed with nucleotide sequences encoding SIGP may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The
protein produced by a transformed cell may be secreted or contained intracellularly
depending on the sequence and/or the vector used. As will be understood by those of skill in
the art, expression vectors containing polynucleotides which encode SIGP may be designed
to contain signal sequences which direct secretion of SIGP through a prokaryotic or
eukaryotic cell membrane. Other constructions may be used to join sequences encoding
SIGP to nucleotide sequences encoding a polypeptide domain which will facilitate
purification of soluble proteins. Such purification facilitating domains include, but are not
limited to, metal chelating peptides such as histidine-tryptophan modules that allow
purification on immobilized metals, protein A domains that allow purification on
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity
purification system (Immunex Corp., Seattle, WA). The inclusion of cleavable linker
sequences, such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego, CA),
between the purification domain and the SIGP encoding sequence may be used to facilitate
purification. One such expression vector provides for expression of a fusion protein
containing SIGP and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or
an enterokinase cleavage site. The histidine residues facilitate purification on immobilized
metal ion affinity chromatography (IMAC). (See, e.g., Porath, J. et al. (1992) Prot. Exp.
Purif. 3:263-281.) The enterokinase cleavage site provides a means for purifying SIGP from
the fusion protein. (See, e.g., Kroll, D.J. et al. (1993) DNA Cell Biol. 12:441-453.)
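
The arrangement of a purification domain, a cleavable linker, and the protein of interest may be sketched at the amino acid level. The Python example below is an illustration only: it assumes a six-histidine tag followed by the enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys, with cleavage after the lysine, and it omits the thioredoxin option mentioned above. The function names `build_fusion` and `cleave` are hypothetical, and the peptide "MKTLLA" is a placeholder, not a SIGP sequence.

```python
HIS_TAG = "H" * 6            # six histidine residues for IMAC purification
ENTEROKINASE_SITE = "DDDDK"  # enterokinase cleaves after the lysine


def build_fusion(protein):
    """Return tag + cleavage site + protein as a one-letter amino acid string."""
    return HIS_TAG + ENTEROKINASE_SITE + protein


def cleave(fusion):
    """Split a fusion at the first enterokinase site.

    Returns (tag_and_linker, released_protein).
    """
    idx = fusion.find(ENTEROKINASE_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found")
    cut = idx + len(ENTEROKINASE_SITE)
    return fusion[:cut], fusion[cut:]


fusion = build_fusion("MKTLLA")  # placeholder peptide
tag_part, released = cleave(fusion)
print(fusion, tag_part, released)
```

In this arrangement the tag and linker remain together after cleavage, so the released protein carries no residues from the purification domain.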




Fragments of SIGP may be produced not only by recombinant production, but also by 
direct peptide synthesis using solid-phase techniques. (See, e.g., Creighton, T.E. (1984)
Proteins: Structures and Molecular Properties, pp. 55-60, W.H. Freeman and Co., New York,
NY.) Protein synthesis may be performed by manual techniques or by automation.
Automated synthesis may be achieved, for example, using the Applied Biosystems 431A
Peptide Synthesizer (Perkin Elmer). Various fragments of SIGP may be synthesized
separately and then combined to produce the full length molecule. 

THERAPEUTICS 

The expression of the human signal peptide-containing proteins of the invention (SIGP)
is closely associated with cell proliferation. Therefore, in cancers or immune responses where
SIGP is an activator, transcription factor, or enhancer, and is promoting cell proliferation, it is
desirable to decrease the expression of SIGP. In conditions where SIGP is an inhibitor or
suppressor and is controlling or decreasing cell proliferation, it is desirable to provide the
protein or to increase the expression of SIGP.

In one embodiment, where SIGP is an inhibitor, SIGP or a fragment or derivative thereof 
may be administered to a subject to treat or prevent a cancer such as adenocarcinoma, 
leukemia, lymphoma, melanoma, myeloma, sarcoma, and teratocarcinoma. Such cancers 
include, but are not limited to, cancers of the adrenal gland, bladder, bone, bone marrow,
brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung,
muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, 
thymus, thyroid, and uterus. 

In another embodiment, a pharmaceutical composition comprising purified SIGP may be 
used to treat or prevent a cancer including, but not limited to, those listed above. 

In another embodiment, an agonist which is specific for SIGP may be administered to a
subject to treat or prevent a cancer including, but not limited to, those cancers listed above.

In another embodiment, a vector capable of expressing SIGP, or a fragment or a
derivative thereof, may be administered to a subject to treat or prevent a cancer including, but
not limited to, those cancers listed above.

In a further embodiment where SIGP is promoting cell proliferation, antagonists which
decrease the expression or activity of SIGP may be administered to a subject to treat or
prevent a cancer such as adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, 
sarcoma, and teratocarcinoma. Such cancers include, but are not limited to, cancers of the 
adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, 
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, 
prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. In one aspect, 
antibodies which specifically bind SIGP may be used directly as an antagonist or indirectly as 
a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue which 
express SIGP. 

In another embodiment, a vector expressing the complement of the polynucleotide 
encoding SIGP may be administered to a subject to treat or prevent a cancer including, but 
not limited to, those cancers listed above. 

In yet another embodiment where SIGP is promoting leukocyte activity or proliferation, 
antagonists which decrease the activity of SIGP may be administered to a subject to treat or 
prevent an immune response. Such responses include, but are not limited to, disorders such 
as AIDS, Addison's disease, adult respiratory distress syndrome, allergies, anemia, asthma, 
atherosclerosis, bronchitis, cholecystitis, Crohn's disease, ulcerative colitis, atopic dermatitis,
dermatomyositis, diabetes mellitus, emphysema, atrophic gastritis, glomerulonephritis, gout, 
Graves' disease, hypereosinophilia, irritable bowel syndrome, lupus erythematosus, multiple 
sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, 
osteoporosis, pancreatitis, polymyositis, rheumatoid arthritis, scleroderma, Sjogren's 
syndrome, and autoimmune thyroiditis; complications of cancer, hemodialysis, extracorporeal 
circulation; viral, bacterial, fungal, parasitic, protozoal, and helminthic infections; and 
trauma. In one aspect, antibodies which specifically bind SIGP may be used directly as an 
antagonist or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical 
agent to cells or tissue which express SIGP. 

In another embodiment, a vector expressing the complement of the polynucleotide 
encoding SIGP may be administered to a subject to treat or prevent an immune response 
including, but not limited to, those listed above. 

In other embodiments, any of the proteins, antagonists, antibodies, agonists,
complementary sequences, or vectors of the invention may be administered in combination
with other appropriate therapeutic agents. Selection of the appropriate agents for use in 
combination therapy may be made by one of ordinary skill in the art, according to 
conventional pharmaceutical principles. The combination of therapeutic agents may act 
synergistically to effect the treatment or prevention of the various disorders described above. 
Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of 
each agent, thus reducing the potential for adverse side effects. 

An antagonist of SIGP may be produced using methods which are generally known in 
the art. In particular, purified SIGP may be used to produce antibodies or to screen libraries 
of pharmaceutical agents to identify those which specifically bind SIGP. Antibodies to SIGP 
may also be generated using methods that are well known in the art. Such antibodies may 
include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, 
Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies 
(i.e., those which inhibit dimer formation) are especially preferred for therapeutic use. 

For the production of antibodies, various hosts including goats, rabbits, rats, mice, 
humans, and others may be immunized by injection with SIGP or with any fragment or 
oligopeptide thereof which has immunogenic properties. Depending on the host species, 
various adjuvants may be used to increase immunological response. Such adjuvants include, 
but are not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active 
substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, 
and dinitrophenol. Among adjuvants used in humans, BCG (bacilli Calmette-Guérin) and
Corynebacterium parvum are especially preferable. 

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to 
SIGP have an amino acid sequence consisting of at least about 5 amino acids, and, more 
preferably, of at least about 10 amino acids. It is also preferable that these oligopeptides, 
peptides, or fragments are identical to a portion of the amino acid sequence of the natural 
protein and contain the entire amino acid sequence of a small, naturally occurring molecule. 
Short stretches of SIGP amino acids may be fused with those of another protein, such as 
KLH, and antibodies to the chimeric molecule may be produced. 

Monoclonal antibodies to SIGP may be prepared using any technique which provides for
the production of antibody molecules by continuous cell lines in culture. These include, but
are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the
EBV-hybridoma technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor,
D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci.
80:2026-2030; and Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.)

In addition, techniques developed for the production of "chimeric antibodies," such as 
the splicing of mouse antibody genes to human antibody genes to obtain a molecule with 
appropriate antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. 
et al. (1984) Proc. Natl. Acad. Sci. 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 
312:604-608; and Takeda, S. et al. (1985) Nature 314:452-454.) Alternatively, techniques
described for the production of single chain antibodies may be adapted, using methods known
in the art, to produce SIGP-specific single chain antibodies. Antibodies with related 
specificity, but of distinct idiotypic composition, may be generated by chain shuffling from 
random combinatorial immunoglobulin libraries. (See, e.g., Burton D.R. (1991) Proc. Natl. 
Acad. Sci. 88:10134-10137.)

Antibodies may also be produced by inducing in vivo production in the lymphocyte 
population or by screening immunoglobulin libraries or panels of highly specific binding 
reagents as disclosed in the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. 
Sci. 86:3833-3837; and Winter, G. et al. (1991) Nature 349:293-299.)

Antibody fragments which contain specific binding sites for SIGP may also be
generated. For example, such fragments include, but are not limited to, F(ab')2 fragments
produced by pepsin digestion of the antibody molecule and Fab fragments generated by 
reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab expression 
libraries may be constructed to allow rapid and easy identification of monoclonal Fab 
fragments with the desired specificity. (See, e.g., Huse, W.D. et al. (1989) Science
246:1275-1281.) 

Various immunoassays may be used for screening to identify antibodies having the 
desired specificity. Numerous protocols for competitive binding or immunoradiometric 
assays using either polyclonal or monoclonal antibodies with established specificities are well 
known in the art. Such immunoassays typically involve the measurement of complex
formation between SIGP and its specific antibody. A two-site, monoclonal-based
immunoassay utilizing monoclonal antibodies reactive to two non-interfering SIGP epitopes
is preferred, but a competitive binding assay may also be employed. (Maddox, supra.)

In another embodiment of the invention, the polynucleotides encoding SIGP, or any
fragment or complement thereof, may be used for therapeutic purposes. In one aspect, the
complement of the polynucleotide encoding SIGP may be used in situations in which it 
would be desirable to block the transcription of the mRNA. In particular, cells may be 
transformed with sequences complementary to polynucleotides encoding SIGP. Thus, 
complementary molecules or fragments may be used to modulate SIGP activity, or to achieve 
regulation of gene function. Such technology is now well known in the art, and sense or
antisense oligonucleotides or larger fragments can be designed from various locations along 
the coding or control regions of sequences encoding SIGP. 

Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia 
viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences 
to the targeted organ, tissue, or cell population. Methods which are well known to those
skilled in the art can be used to construct vectors which will express nucleic acid sequences
complementary to the polynucleotides of the gene encoding SIGP. (See, e.g., Sambrook,
supra; and Ausubel, supra.)

Genes encoding SIGP can be turned off by transforming a cell or tissue with expression 
vectors which express high levels of a polynucleotide, or fragment thereof, encoding SIGP.
Such constructs may be used to introduce untranslatable sense or antisense sequences into a 
cell. Even in the absence of integration into the DNA, such vectors may continue to 
transcribe RNA molecules until they are disabled by endogenous nucleases. Transient 
expression may last for a month or more with a non-replicating vector, and may last even 
longer if appropriate replication elements are part of the vector system.

As mentioned above, modifications of gene expression can be obtained by designing 
complementary sequences or antisense molecules (DNA, RNA, or PNA) to the control, 5', or
regulatory regions of the gene encoding SIGP. Oligonucleotides derived from the
transcription initiation site, e.g., between about positions -10 and +10 from the start site, are
preferred. Similarly, inhibition can be achieved using triple helix base-pairing methodology.
Triple helix pairing is useful because it inhibits the ability of the double helix to
open sufficiently for the binding of polymerases, transcription factors, or regulatory
molecules. Recent therapeutic advances using triplex DNA have been described in the
literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr, Molecular and
Immunologic Approaches, Futura Publishing Co., Mt. Kisco, NY, pp. 163-177.) A
complementary sequence or antisense molecule may also be designed to block translation of
mRNA by preventing the transcript from binding to ribosomes.
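
By way of illustration only, the design of an antisense oligonucleotide spanning about positions -10 to +10 of a start site may be sketched as follows. The function names, the 0-based indexing, and the convention that there is no position 0 (so the window is 20 nucleotides long) are assumptions made for this sketch and are not taken from the specification.

```python
# Illustrative sketch: derive an antisense DNA oligonucleotide covering
# positions -10..+10 around a start site. All names are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_for_window(sense: str, start_index: int,
                         upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligo for the window around the +1 nucleotide.

    start_index is the 0-based index of the +1 nucleotide in `sense`.
    Assumes no position 0, so the window is sense[start-10 : start+10].
    """
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense[lo:hi])
```

For a made-up sense strand of ten A residues followed by ten G residues with the +1 nucleotide at index 10, the sketch returns ten C residues followed by ten T residues.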

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific 
cleavage of RNA. The mechanism of ribozyme action involves sequence-specific 
hybridization of the ribozyme molecule to complementary target RNA, followed by
endonucleolytic cleavage. For example, engineered hammerhead motif ribozyme molecules
may specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding 
SIGP. 

Specific ribozyme cleavage sites within any potential RNA target are initially identified
by scanning the target molecule for ribozyme cleavage sites, including the following
sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and
20 ribonucleotides, corresponding to the region of the target gene containing the cleavage
site, may be evaluated for secondary structural features which may render the oligonucleotide
inoperable. The suitability of candidate targets may also be evaluated by testing accessibility
to hybridization with complementary oligonucleotides using ribonuclease protection assays.
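
The scanning step described above may be sketched, purely as an illustration, as follows. The helper names, the default window length of 18, and the choice to center each window on the cleavage triplet are assumptions of this sketch; the evaluation of secondary structure and of accessibility is not attempted here.

```python
# Illustrative sketch: locate GUA, GUU, and GUC triplets in a target RNA
# and extract 15-20 nt candidate windows around each. Names are hypothetical.

CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna: str):
    """Return (position, triplet) for every cleavage triplet (0-based)."""
    rna = rna.upper().replace("T", "U")  # accept cDNA-style input as well
    return [(i, rna[i:i + 3])
            for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def candidate_windows(rna: str, length: int = 18):
    """Return (position, triplet, window) with the triplet roughly centered.

    Sites too close to either end for a full-length window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("window length should be between 15 and 20")
    rna = rna.upper().replace("T", "U")
    windows = []
    for pos, triplet in find_cleavage_sites(rna):
        start = pos + 1 - length // 2  # center on the middle base of the triplet
        if start < 0 or start + length > len(rna):
            continue
        windows.append((pos, triplet, rna[start:start + length]))
    return windows
```

Each returned window would then be a candidate for the structural and accessibility evaluations mentioned in the text.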

Complementary ribonucleic acid molecules and ribozymes of the invention may be
prepared by any method known in the art for the synthesis of nucleic acid molecules. These
include techniques for chemically synthesizing oligonucleotides such as solid phase
phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by
in vitro and in vivo transcription of DNA sequences encoding SIGP. Such DNA sequences
may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters
such as T7 or SP6. Alternatively, these cDNA constructs that synthesize complementary
RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life.
Possible modifications include, but are not limited to, the addition of flanking sequences at
the 5' and/or 3' ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather
than phosphodiester linkages within the backbone of the molecule. This concept is
inherent in the production of PNAs and can be extended in all of these molecules by the
inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-,
methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and
uridine which are not as easily recognized by endogenous endonucleases.

Many methods for introducing vectors into cells or tissues are available and equally 
suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced
into stem cells taken from the patient and clonally propagated for autologous transplant back 
into that same patient. Delivery by transfection, by liposome injections, or by polycationic 
amino polymers may be achieved using methods which are well known in the art. (See, e.g., 
Goldman, C.K. et al. (1997) Nature Biotechnology 15:462-466.) 

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits, 
monkeys, and most preferably, humans. 

An additional embodiment of the invention relates to the administration of a 
pharmaceutical or sterile composition, in conjunction with a pharmaceutically acceptable 
carrier, for any of the therapeutic effects discussed above. Such pharmaceutical compositions 
may consist of SIGP, antibodies to SIGP, and mimetics, agonists, antagonists, or inhibitors of 
SIGP. The compositions may be administered alone or in combination with at least one other 
agent, such as a stabilizing compound, which may be administered in any sterile, 
biocompatible pharmaceutical carrier including, but not limited to, saline, buffered saline, 
dextrose, and water. The compositions may be administered to a patient alone, or in 
combination with other agents, drugs, or hormones.

The pharmaceutical compositions utilized in this invention may be administered by any 
number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, 
intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, 
intranasal, enteral, topical, sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions may contain 
suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which
facilitate processing of the active compounds into preparations which can be used
pharmaceutically. Further details on techniques for formulation and administration may be
found in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co.,
Easton, PA).

Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral
administration. Such carriers enable the pharmaceutical compositions to be formulated as 
tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for 
ingestion by the patient. 

Pharmaceutical preparations for oral use can be obtained through combining active
compounds with solid excipient and processing the resultant mixture of granules (optionally,
after grinding) to obtain tablets or dragee cores. Suitable auxiliaries can be added, if desired.
Suitable excipients include carbohydrate or protein fillers, such as sugars, including lactose,
sucrose, mannitol, and sorbitol; starch from corn, wheat, rice, potato, or other plants;
cellulose, such as methyl cellulose, hydroxypropylmethyl-cellulose, or sodium
carboxymethylcellulose; gums, including arabic and tragacanth; and proteins, such as gelatin
and collagen. If desired, disintegrating or solubilizing agents may be added, such as the 
cross-linked polyvinyl pyrrolidone, agar, and alginic acid or a salt thereof, such as sodium 
alginate. 

Dragee cores may be used in conjunction with suitable coatings, such as concentrated
sugar solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents 
or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for 
product identification or to characterize the quantity of active compound, i.e., dosage. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or
sorbitol. Push-fit capsules can contain active ingredients mixed with fillers or binders, such
as lactose or starches, lubricants, such as talc or magnesium stearate, and, optionally,
stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable
liquids, such as fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

Pharmaceutical formulations suitable for parenteral administration may be formulated in
aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may 
contain substances which increase the viscosity of the suspension, such as sodium
carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active
compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic 
solvents or vehicles include fatty oils, such as sesame oil, or synthetic fatty acid esters, such 
as ethyl oleate, triglycerides, or liposomes. Non-lipid polycationic amino polymers may also 
be used for delivery. Optionally, the suspension may also contain suitable stabilizers or
agents to increase the solubility of the compounds and allow for the preparation of highly
concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art.

The pharmaceutical compositions of the present invention may be manufactured in a
manner that is known in the art, e.g., by means of conventional mixing, dissolving,
granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping, or 
lyophilizing processes. 

The pharmaceutical composition may be provided as a salt and can be formed with many 
acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, and
succinic acid. Salts tend to be more soluble in aqueous or other protonic solvents than are the
corresponding free base forms. In other cases, the preferred preparation may be a lyophilized 
powder which may contain any or all of the following: 1 mM to 50 mM histidine, 0.1% to 2% 
sucrose, and 2% to 7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer 
prior to use. 

After pharmaceutical compositions have been prepared, they can be placed in an
appropriate container and labeled for treatment of an indicated condition. For administration
of SIGP, such labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include compositions 
wherein the active ingredients are contained in an effective amount to achieve the intended
purpose. The determination of an effective dose is well within the capability of those skilled
in the art.

For any compound, the therapeutically effective dose can be estimated initially either in
cell culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits,
dogs, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful
doses and routes for administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example
SIGP or fragments thereof, antibodies to SIGP, and agonists, antagonists or inhibitors of
SIGP, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may
be determined by standard pharmaceutical procedures in cell cultures or with experimental
animals, such as by calculating the ED50 (the dose therapeutically effective in 50% of the
population) or LD50 (the dose lethal to 50% of the population) statistics. The dose ratio of
toxic to therapeutic effects is the therapeutic index, and it can be expressed as the
LD50/ED50 ratio. Pharmaceutical compositions which exhibit large therapeutic indices are
preferred. The data obtained from cell culture assays and animal studies are used to
formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no
toxicity. The dosage varies within this range depending upon the dosage form employed, the
sensitivity of the patient, and the route of administration.
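
As a simple worked illustration, the therapeutic index may be computed as follows, assuming the usual convention in which the index is the ratio of the toxic dose to the effective dose (LD50/ED50), so that a larger value indicates a wider margin between efficacy and toxicity. The function name and the dose values are hypothetical and serve only to show the arithmetic.

```python
# Illustrative sketch: therapeutic index as LD50 / ED50 (assumed convention).

def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses, e.g. in mg/kg; not data from the specification.
example_index = therapeutic_index(ld50=500.0, ed50=5.0)
```

With these made-up values the index is 100, meaning the dose lethal to half the population is one hundred times the dose effective in half the population.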

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient
levels of the active moiety or to maintain the desired effect. Factors which may be taken into
account include the severity of the disease state, the general health of the subject, the age,
weight, and gender of the subject, time and frequency of administration, drug combination(s),
reaction sensitivities, and response to therapy. Long-acting pharmaceutical compositions
may be administered every 3 to 4 days, every week, or biweekly depending on the half-life 
and clearance rate of the particular formulation. 

Normal dosage amounts may vary from about 0.1 µg to 100,000 µg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages
and methods of delivery is provided in the literature and generally available to practitioners in
the art. Those skilled in the art will employ different formulations for nucleotides than for
proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be
specific to particular cells, conditions, locations, etc.

DIAGNOSTICS

In another embodiment, antibodies which specifically bind SIGP may be used for the 
diagnosis of disorders characterized by expression of SIGP, or in assays to monitor patients 
being treated with SIGP or agonists, antagonists, or inhibitors of SIGP. Antibodies useful for 
diagnostic purposes may be prepared in the same manner as described above for therapeutics.
Diagnostic assays for SIGP include methods which utilize the antibody and a label to detect
SIGP in human body fluids or in extracts of cells or tissues. The antibodies may be used with 
or without modification, and may be labeled by covalent or non-covalent attachment of a 
reporter molecule. A wide variety of reporter molecules, several of which are described 
above, are known in the art and may be used. 

A variety of protocols for measuring SIGP, including ELISAs, RIAs, and FACS, are
known in the art and provide a basis for diagnosing altered or abnormal levels of SIGP
expression. Normal or standard values for SIGP expression are established by combining
body fluids or cell extracts taken from normal mammalian subjects, preferably human, with
antibody to SIGP under conditions suitable for complex formation. The amount of standard
complex formation may be quantitated by various methods, preferably by photometric means.
Quantities of SIGP expressed in subject, control, and disease samples from biopsied tissues 
are compared with the standard values. Deviation between standard and subject values 
establishes the parameters for diagnosing disease. 

In another embodiment of the invention, the polynucleotides encoding SIGP may be
used for diagnostic purposes. The polynucleotides which may be used include
oligonucleotide sequences, complementary RNA and DNA molecules, and PNAs. The
polynucleotides may be used to detect and quantitate gene expression in biopsied tissues in
which expression of SIGP may be correlated with disease. The diagnostic assay may be used
to determine absence, presence, and excess expression of SIGP, and to monitor regulation of
SIGP levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting
polynucleotide sequences, including genomic sequences, encoding SIGP or closely related
molecules may be used to identify nucleic acid sequences which encode SIGP. The
specificity of the probe, whether it is made from a highly specific region, e.g., the 5'
regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency of
the hybridization or amplification (maximal, high, intermediate, or low), will determine 
whether the probe identifies only naturally occurring sequences encoding SIGP, alleles, or 
related sequences. 

Probes may also be used for the detection of related sequences, and should preferably
contain at least 50% of the nucleotides from any of the SIGP encoding sequences. The
hybridization probes of the subject invention may be DNA or RNA and may be derived from
the sequence of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID
NO:82, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87,
SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID
NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98,
SEQ ID NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ
ID NO:104, SEQ ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID
NO:109, SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID
NO:114, SEQ ID NO:115, SEQ ID NO:116, SEQ ID NO:117, SEQ ID NO:118, SEQ ID
NO:119, SEQ ID NO:120, SEQ ID NO:121, SEQ ID NO:122, SEQ ID NO:123, SEQ ID
NO:124, SEQ ID NO:125, SEQ ID NO:126, SEQ ID NO:127, SEQ ID NO:128, SEQ ID
NO:129, SEQ ID NO:130, SEQ ID NO:131, SEQ ID NO:132, SEQ ID NO:133, SEQ ID
NO:134, SEQ ID NO:135, SEQ ID NO:136, SEQ ID NO:137, SEQ ID NO:138, SEQ ID
NO:139, SEQ ID NO:140, SEQ ID NO:141, SEQ ID NO:142, SEQ ID NO:143, SEQ ID
NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147, SEQ ID NO:148, SEQ ID
NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID NO:152, SEQ ID NO:153, and SEQ ID
NO:154, or from genomic sequences including promoters, enhancers, and introns of the SIGP
gene.

Means for producing specific hybridization probes for DNAs encoding SIGP include the
cloning of polynucleotide sequences encoding SIGP or SIGP derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by means of the addition of the
appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes
may be labeled by a variety of reporter groups, for example, by radionuclides such as 32P or
35S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via
avidin/biotin coupling systems, and the like.

Polynucleotide sequences encoding SIGP may be used for the diagnosis of a disorder 
associated with either increased or decreased expression of SIGP. Examples of such a 
disorder include, but are not limited to, cancers such as adenocarcinoma, leukemia, 
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and cancers of the adrenal gland,
bladder, bone, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney,
liver, lung, bone marrow, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary
glands, skin, spleen, testis, thymus, thyroid, and uterus; neuronal disorders such as akathisia,
Alzheimer's disease, amnesia, amyotrophic lateral sclerosis, bipolar disorder, catatonia,
cerebral neoplasms, dementia, depression, Down's syndrome, tardive dyskinesia, dystonias,
epilepsy, Huntington's disease, multiple sclerosis, neurofibromatosis, Parkinson's disease,
paranoid psychoses, schizophrenia, and Tourette's disorder; and immunological disorders
such as AIDS, Addison's disease, adult respiratory distress syndrome, allergies, anemia,
asthma, atherosclerosis, bronchitis, cholecystitis, Crohn's disease, ulcerative colitis, atopic
dermatitis, dermatomyositis, diabetes mellitus, emphysema, atrophic gastritis,
glomerulonephritis, gout, Graves' disease, hypereosinophilia, irritable bowel syndrome, lupus
erythematosus, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation,
osteoarthritis, osteoporosis, pancreatitis, polymyositis, rheumatoid arthritis, scleroderma,
Sjogren's syndrome, and thyroiditis. The polynucleotide sequences encoding SIGP may be
used in Southern or northern analysis, dot blot, or other membrane-based technologies; in
PCR technologies; in dipstick, pin, and ELISA assays; and in microarrays utilizing fluids or 
tissues from patients to detect altered SIGP expression. Such qualitative or quantitative 
methods are well known in the art. 

In a particular aspect, the nucleotide sequences encoding SIGP may be useful in assays 
that detect the presence of associated disorders, particularly those mentioned above. The
nucleotide sequences encoding SIGP may be labeled by standard methods and added to a
fluid or tissue sample from a patient under conditions suitable for the formation of
hybridization complexes. After a suitable incubation period, the sample is washed and the
signal is quantitated and compared with a standard value. If the amount of signal in the
patient sample is significantly altered in comparison to a control sample, then the presence of
altered levels of nucleotide sequences encoding SIGP in the sample indicates the presence of 
the associated disorder. Such assays may also be used to evaluate the efficacy of a particular 
therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment 
of an individual patient. 

In order to provide a basis for the diagnosis of a disorder associated with expression of
SIGP, a normal or standard profile for expression is established. This may be accomplished
by combining body fluids or cell extracts taken from normal subjects, either animal or
human, with a sequence, or a fragment thereof, encoding SIGP, under conditions suitable for
hybridization or amplification. Standard hybridization may be quantified by comparing the
values obtained from normal subjects with values from an experiment in which a known
amount of a substantially purified polynucleotide is used. Standard values obtained in this 
manner may be compared with values obtained from samples from patients who are 
symptomatic for a disorder. Deviation from standard values is used to establish the presence 
of a disorder. 
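
One way of relating a measured signal to known amounts of purified polynucleotide is linear interpolation on a standard curve, sketched below for illustration only. The function name, the use of piecewise-linear interpolation, and the numerical values are assumptions of this sketch; an actual assay may call for a different curve model.

```python
# Illustrative sketch: piecewise-linear interpolation on a standard curve.
# `standards` is a list of (known_amount, measured_signal) pairs.

from bisect import bisect_left


def interpolate_amount(signal: float, standards) -> float:
    """Estimate the amount corresponding to `signal` from the standards."""
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range of the standard curve")
    j = bisect_left(signals, signal)
    if signals[j] == signal:
        return points[j][0]
    (a0, s0), (a1, s1) = points[j - 1], points[j]
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: amount (arbitrary units) versus signal.
curve = [(0.0, 0.0), (10.0, 1.0), (20.0, 2.0)]
estimate = interpolate_amount(1.5, curve)
```

Signals outside the range spanned by the standards are rejected rather than extrapolated, which is a deliberately conservative choice in this sketch.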

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression
in the patient begins to approximate that which is observed in the normal subject. The results 
obtained from successive assays may be used to show the efficacy of treatment over a period 
ranging from several days to months. 

With respect to cancer, the presence of a relatively high amount of transcript in biopsied
tissue from an individual may indicate a predisposition for the development of the disease, or
may provide a means for detecting the disease prior to the appearance of actual clinical
symptoms. A more definitive diagnosis of this type may allow health professionals to
employ preventative measures or aggressive treatment earlier, thereby preventing the
development or further progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding
SIGP may involve the use of PCR. These oligomers may be chemically synthesized,
generated enzymatically, or produced in vitro. Oligomers will preferably contain a fragment
of a polynucleotide encoding SIGP, or a fragment of a polynucleotide complementary to the
polynucleotide encoding SIGP, and will be employed under optimized conditions for
identification of a specific gene or condition. Oligomers may also be employed under less
stringent conditions for detection or quantitation of closely related DNA or RNA sequences. 

Methods which may also be used to quantitate the expression of SIGP include 
radiolabeling or biotinylating nucleotides, coamplification of a control nucleic acid, and 
interpolating results from standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol.
Methods 159:235-244; and Duplaa, C. et al. (1993) Anal. Biochem. 229-236.) The speed of
quantitation of multiple samples may be accelerated by running the assay in an ELISA format
where the oligomer of interest is presented in various dilutions and a spectrophotometric or
colorimetric response gives rapid quantitation.

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein may be used as targets in a microarray. The
microarray can be used to monitor the expression level of large numbers of genes
simultaneously and to identify genetic variants, mutations, and polymorphisms. This
information may be used to determine gene function, to understand the genetic basis of a
disorder, to diagnose a disorder, and to develop and monitor the activities of therapeutic
agents. 

In one embodiment, the microarray is prepared and used according to methods known in 
the art. (See, e.g., Chee et al. (1995) PCT application WO95/11995; Lockhart, D.J. et al.
(1996) Nat. Biotech. 14:1675-1680; and Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
93:10614-10619.)

The microarray is preferably composed of a large number of unique single-stranded 
nucleic acid sequences, usually either synthetic antisense oligonucleotides or fragments of 
cDNAs. The oligonucleotides are preferably about 6 to 60 nucleotides in length, more 
preferably about 15 to 30 nucleotides in length, and most preferably about 20 to 25 
nucleotides in length. It may be preferable to use oligonucleotides which are about 7 to 10 
nucleotides in length. The microarray may contain oligonucleotides which cover the known 
5' or 3' sequence, sequential oligonucleotides which cover the full length sequence, or unique 
oligonucleotides selected from particular areas along the length of the sequence. 
Polynucleotides used in the microarray may be oligonucleotides specific to a gene or genes of 
interest. Oligonucleotides can also be specific to one or more unidentified cDNAs associated 
with a particular cell type or tissue type. It may be appropriate to use pairs of 
oligonucleotides on a microarray. The first oligonucleotide in each pair differs from the 
second oligonucleotide by one nucleotide. This nucleotide is preferably located in the center 
of the sequence. The second oligonucleotide serves as a control. The number of 
oligonucleotide pairs may range from about 2 to 1,000,000. 
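
By way of illustration only, the pairing of oligonucleotides described above may be sketched as follows. The function name `mismatch_pair` and the choice of substituting the central base with its complement are assumptions made for the example; the description requires only that the two members of a pair differ by one nucleotide, preferably at the center.

```python
# Illustrative sketch: build a pair of oligonucleotides that differ at the central base.
# The substitution rule (swap the base for its complement) is an assumption.
_SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_pair(oligo):
    """Return (first, control); `first` differs from `control` only at the center."""
    control = oligo.upper()
    mid = len(control) // 2
    first = control[:mid] + _SWAP[control[mid]] + control[mid + 1:]
    return first, control


if __name__ == "__main__":
    first, control = mismatch_pair("ACGTACGTACGTACGTACGTA")
    print(first)
    print(control)
```

Any other single-base substitution at the central position would serve equally well for the purposes of the sketch.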

In order to produce oligonucleotides for use on a microarray, the gene of interest is 
examined using a computer algorithm which starts at the 5' end, or, more preferably, at the 3' 
end of the nucleotide sequence. The algorithm identifies oligomers of defined length that are 
unique to the gene, have a GC content within a range suitable for hybridization, and lack 
secondary structure that may interfere with hybridization. In one aspect, the oligomers may 
be synthesized on a substrate using a light-directed chemical process. (See, e.g., Chee et al., 
supra.) The substrate may be any suitable solid support, e.g., paper, nylon, any other type of 
membrane, or a filter, chip, or glass slide. 
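
A minimal sketch of such a selection algorithm is given below for illustration. It is not the algorithm actually used; the function names, the 40% to 60% GC window, the default oligomer length, and the list of background sequences are all assumptions, and the secondary structure screen is omitted.

```python
# Illustrative sketch: scan a gene from its 3' end for oligomers of defined length
# that are unique and fall within an assumed GC-content window.
def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_oligos(gene, background, length=25, gc_min=0.40, gc_max=0.60, limit=20):
    """Return up to `limit` (start, oligo) tuples, walking from the 3' end to the 5' end."""
    gene = gene.upper()
    others = [s.upper() for s in background]
    picks = []
    for start in range(len(gene) - length, -1, -1):
        oligo = gene[start:start + length]
        if not gc_min <= gc_fraction(oligo) <= gc_max:
            continue  # GC content outside the assumed window
        if gene.find(oligo) != start or gene.find(oligo, start + 1) != -1:
            continue  # occurs more than once within the gene
        if any(oligo in other for other in others):
            continue  # also present in another (background) sequence
        picks.append((start, oligo))
        if len(picks) == limit:
            break
    return picks


if __name__ == "__main__":
    for start, oligo in candidate_oligos("ATGCATGCATTTGGCCAAGT", [], length=8):
        print(start, oligo)
```

The default limit of 20 follows the approximate number of oligonucleotides per sequence mentioned in Example VII; the short test sequence is hypothetical.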

In another aspect, the oligonucleotides may be synthesized on the surface of the substrate 
using a chemical coupling procedure and an ink jet application apparatus. (See, e.g., 
Baldeschweiler et al. (1995) PCT application WO95/251116.) An array analogous to a dot or 
slot blot (HYBRIDOT® apparatus, GIBCO/BRL) may be used to arrange and link cDNA 
fragments or oligonucleotides to the surface of a substrate using a vacuum system or thermal, 
UV, mechanical, or chemical bonding procedures. An array may also be produced by hand or 
by using available devices, materials, and machines, e.g., Brinkmann® multichannel pipettors 
or robotic instruments. The array may contain from 2 to 1,000,000 or any other feasible 
number of oligonucleotides. 

In order to conduct sample analysis using the microarrays, polynucleotides are extracted 
from a sample. The sample may be obtained from any bodily fluid (e.g., blood, urine, saliva, 
phlegm, or gastric juices), cultured cells, biopsies, or other tissue preparations. To produce 
probes, the polynucleotides extracted from the sample are used to produce nucleic acid 
sequences complementary to the nucleic acids on the microarray. If the microarray contains 
cDNAs, antisense RNAs (aRNAs) are appropriate probes. Therefore, in one aspect, mRNA is 
reverse-transcribed to cDNA. The cDNA, in the presence of fluorescent label, is used to 
produce fragment or oligonucleotide aRNA probes. The fluorescently labeled probes are 
incubated with the microarray so that the probes hybridize to the microarray oligonucleotides. 
Nucleic acid sequences used as probes can include polynucleotides, fragments, and 
complementary or antisense sequences produced using restriction enzymes, PCR, or other 
methods known in the art. 

Hybridization conditions can be adjusted so that hybridization occurs with varying 
degrees of complementarity. A scanner can be used to determine the levels and patterns of 
fluorescence after removal of any nonhybridized probes. The degree of complementarity and 
the relative abundance of each oligonucleotide sequence on the microarray can be assessed 
through analysis of the scanned images. A detection system may be used to measure the 
absence, presence, or level of hybridization for any of the sequences. (See, e.g., Heller, R.A. 
et al. (1997) Proc. Natl. Acad. Sci. 94:2150-2155.) 

In another embodiment of the invention, nucleic acid sequences encoding SIGP may be 
used to generate hybridization probes useful in mapping the naturally occurring genomic 
sequence. The sequences may be mapped to a particular chromosome, to a specific region of 
a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes 
(HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), 
bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Price, C.M. 
(1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet. 7:149-154.) 

Fluorescent in situ hybridization (FISH) may be correlated with other physical 
chromosome mapping techniques and genetic map data. (See, e.g., Heinz-Ulrich, et al. 
(1995) in Meyers, R.A. (ed.) Molecular Biology and Biotechnology, VCH Publishers, New 
York, NY, pp. 965-968.) Examples of genetic map data can be found in various scientific 
journals or at the Online Mendelian Inheritance in Man (OMIM) site. Correlation between 
the location of the gene encoding SIGP on a physical chromosomal map and a specific 
disorder, or a predisposition to a specific disorder, may help define the region of DNA 
associated with that disorder. The nucleotide sequences of the invention may be used to 
detect differences in gene sequences among normal, carrier, and affected individuals. 

In situ hybridization of chromosomal preparations and physical mapping techniques, 
such as linkage analysis using established chromosomal markers, may be used for extending 
genetic maps. Often the placement of a gene on the chromosome of another mammalian 
species, such as mouse, may reveal associated markers even if the number or arm of a 
particular human chromosome is not known. New sequences can be assigned to 
chromosomal arms by physical mapping. This provides valuable information to investigators 
searching for disease genes using positional cloning or other gene discovery techniques. 
Once the disease or syndrome has been crudely localized by genetic linkage to a particular 
genomic region, e.g., AT to 11q22-23, any sequences mapping to that area may represent 
associated or regulatory genes for further investigation. (See, e.g., Gatti, R.A. et al. (1988) 
Nature 336:577-580.) The nucleotide sequence of the subject invention may also be used to 
detect differences in the chromosomal location due to translocation, inversion, etc., among 
normal, carrier, or affected individuals. 

In another embodiment of the invention, SIGP, its catalytic or immunogenic fragments, 
or oligopeptides thereof can be used for screening libraries of compounds in any of a variety 
of drug screening techniques. The fragment employed in such screening may be free in 
solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The 
formation of binding complexes between SIGP and the agent being tested may be measured. 

Another technique for drug screening provides for high throughput screening of 
compounds having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et 
al. (1984) PCT application WO84/03564.) In this method, large numbers of different small 
test compounds are synthesized on a solid substrate, such as plastic pins or some other 
surface. The test compounds are reacted with SIGP, or fragments thereof, and washed. 
Bound SIGP is then detected by methods well known in the art. Purified SIGP can also be 
coated directly onto plates for use in the aforementioned drug screening techniques. 
Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize 
it on a solid support. 

In another embodiment, one may use competitive drug screening assays in which 
neutralizing antibodies capable of binding SIGP specifically compete with a test compound 
for binding SIGP. In this manner, antibodies can be used to detect the presence of any 
peptide which shares one or more antigenic determinants with SIGP. 

In additional embodiments, the nucleotide sequences which encode SIGP may be used in 
any molecular biology techniques that have yet to be developed, provided the new techniques 
rely on properties of nucleotide sequences that are currently known, including, but not limited 
to, such properties as the triplet genetic code and specific base pair interactions. 

The examples below are provided to illustrate the subject invention and are not included 
for the purpose of limiting the invention. 

EXAMPLES 

For purposes of example, the preparation and sequencing of the SPLNNOT04 cDNA 
library, from which Incyte Clones 1534876 and 1559131 were isolated, is described. 
Preparation and sequencing of cDNAs in libraries in the LIFESEQ™ database have varied 
over time, and the gradual changes involved use of kits, plasmids, and machinery available at 
the particular time the library was made and analyzed. 



I. SPLNNOT04 cDNA Library Construction 

The SPLNNOT04 cDNA library was constructed from microscopically normal spleen 
tissue obtained from a 2-year-old Hispanic male who died of cerebral anoxia. The patient's 
serologies and past medical history were negative. 

The frozen tissue was homogenized and lysed using a Brinkmann Homogenizer Polytron 
PT-3000 (Brinkmann Instruments, Westbury, NJ) in guanidinium isothiocyanate solution. 
The lysate was centrifuged over a 5.7 M CsCl cushion using a Beckman SW28 rotor in a 
Beckman L8-70M Ultracentrifuge (Beckman Instruments) for 18 hours at 25,000 rpm at 
ambient temperature. The RNA was extracted with acid phenol pH 4.0, precipitated using 0.3 
M sodium acetate and 2.5 volumes of ethanol, resuspended in RNase-free water, and DNase 
treated at 37°C. The RNA extraction and precipitation were repeated as before. The mRNA 
was then isolated using the Qiagen Oligotex kit (QIAGEN Inc., Chatsworth, CA) and used to 
construct the cDNA library. 

The mRNA was handled according to the recommended protocols in the Superscript 
plasmid system (Cat. #18248-013, Gibco-BRL, Gaithersburg, MD). cDNA synthesis was 
initiated with a NotI-oligo d(T) primer. Double-stranded cDNA was blunted, ligated to 
EcoRI adaptors, digested with NotI, and fractionated on a Sepharose CL4B column (Cat. 
#275105-01, Pharmacia), and those cDNAs exceeding 400 bp were ligated into the NotI and 
EcoRI sites of the pINCY 1 vector (Incyte). The plasmid pINCY 1 was subsequently 
transformed into DH5α™ competent cells (Cat. #18258-012, Gibco-BRL). 

II. Isolation and Sequencing of cDNA Clones 

Plasmid cDNA was released from the cells and purified using the REAL Prep 96 plasmid 
kit (Catalog #26173, QIAGEN). The recommended protocol was employed except for the 
following changes: 1) the bacteria were cultured in 1 ml of sterile Terrific Broth (Catalog 
#22711, Gibco-BRL) with carbenicillin at 25 mg/L and glycerol at 0.4%; 2) after inoculation, 
the cultures were incubated for 19 hours and at the end of incubation, the cells were lysed 
with 0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA 
pellet was resuspended in 0.1 ml of distilled water. After the last step in the protocol, 
samples were transferred to a 96-well block for storage at 4°C. 

cDNAs were sequenced according to the method of Sanger et al. (1975, J. Mol. Biol. 
94:441f), using the Perkin Elmer Catalyst 800 or a Hamilton Micro Lab 2200 (Hamilton, 
Reno, NV) in combination with Peltier Thermal Cyclers (PTC200 from MJ Research, 
Watertown, MA) and Applied Biosystems 377 DNA Sequencing Systems or the Perkin 
Elmer 373 DNA Sequencing System, and the reading frame was determined. 

III. Homology Searching of cDNA Clones and Their Deduced Proteins 

The nucleotide sequences and/or amino acid sequences of the Sequence Listing were 
used to query sequences in the GenBank, SwissProt, BLOCKS, and Pima II databases. These 
databases, which contain previously identified and annotated sequences, were searched for 
regions of homology using BLAST (Basic Local Alignment Search Tool). (See, e.g., 
Altschul, S.F. (1993) J. Mol. Evol. 36:290-300; and Altschul et al. (1990) J. Mol. Biol. 
215:403-410.) 

BLAST produced alignments of both nucleotide and amino acid sequences to determine 
sequence similarity. Because of the local nature of the alignments, BLAST was especially 
useful in determining exact matches or in identifying homologs which may be of prokaryotic 
(bacterial) or eukaryotic (animal, fungal, or plant) origin. Other algorithms could have been 
used when dealing with primary sequence patterns and secondary structure gap penalties. 
(See, e.g., Smith, T. et al. (1992) Protein Engineering 5:35-51.) The sequences disclosed in 
this application have lengths of at least 49 nucleotides and have no more than 12% uncalled 
bases (where N is recorded rather than A, C, G, or T). 
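
For illustration, the length and uncalled-base criteria stated above can be expressed as a simple check. The function name `passes_quality` is hypothetical, and the sketch makes no claim about how the criteria were actually enforced.

```python
# Illustrative sketch: accept a read only if it is at least 49 nucleotides long
# and no more than 12% of its bases are uncalled (recorded as N).
def passes_quality(seq, min_len=49, max_n_percent=12):
    """Return True if seq meets the minimum length and uncalled-base limits."""
    seq = seq.upper()
    if len(seq) < min_len:
        return False
    # Integer comparison avoids floating-point rounding at the 12% boundary.
    return seq.count("N") * 100 <= max_n_percent * len(seq)


if __name__ == "__main__":
    print(passes_quality("ACGT" * 12 + "A"))   # 49 nt, no uncalled bases
    print(passes_quality("N" * 7 + "A" * 43))  # 14% uncalled bases
```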

The BLAST approach searched for matches between a query sequence and a database 
sequence. BLAST evaluated the statistical significance of any matches found, and reported 
only those matches that satisfied the user-selected threshold of significance. In this application, 
the threshold was set at 10^-25 for nucleotides and 10^-8 for peptides. 
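
The effect of the two thresholds may be sketched as follows. The representation of a match as an (identifier, probability) tuple and the identifiers themselves are hypothetical; only the two threshold values are taken from the text.

```python
# Illustrative sketch: keep only matches at or below the significance threshold.
THRESHOLDS = {"nucleotide": 1e-25, "peptide": 1e-8}


def significant_hits(hits, kind):
    """Filter (identifier, probability) tuples by the threshold for `kind`."""
    cutoff = THRESHOLDS[kind]
    return [hit for hit in hits if hit[1] <= cutoff]


if __name__ == "__main__":
    hits = [("hit_a", 1e-30), ("hit_b", 1e-10)]  # hypothetical matches
    print(significant_hits(hits, "nucleotide"))
    print(significant_hits(hits, "peptide"))
```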

Incyte nucleotide sequences were searched against the GenBank databases for primate 
(pri), rodent (rod), and other mammalian sequences (mam), and deduced amino acid 
sequences from the same clones were then searched against GenBank functional protein 
databases, mammalian (mamp), vertebrate (vrtp), and eukaryote (eukp), for homology. 



IV. Northern Analysis 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a 
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which 
RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 
7; and Ausubel, F.M. et al. supra, ch. 4 and 16.) 

Analogous computer techniques applying BLAST are used to search for identical or 
related molecules in nucleotide databases such as GenBank or the LIFESEQ™ database (Incyte 
Pharmaceuticals). This analysis is much faster than multiple membrane-based hybridizations. 
In addition, the sensitivity of the computer search can be modified to determine whether any 
particular match is categorized as exact or homologous. 

The basis of the search is the product score, which is defined as: 

    (% sequence identity x % maximum BLAST score) / 100 

The product score takes into account both the degree of similarity between two sequences and 
the length of the sequence match. For example, with a product score of 40, the match will be 
exact within a 1% to 2% error, and, with a product score of 70, the match will be exact. 
Homologous molecules are usually identified by selecting those which show product scores 
between 15 and 40, although lower scores may identify related molecules. 
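
As a worked illustration, the product score and the categories described above can be computed as shown below. The function names are hypothetical, the two percentages are taken as given inputs, and the treatment of scores falling between the stated reference points (40 and 70) is an assumption of the sketch.

```python
# Illustrative sketch of the product score and the categories described in the text.
def product_score(pct_identity, pct_max_blast_score):
    """(% sequence identity x % maximum BLAST score) / 100."""
    return pct_identity * pct_max_blast_score / 100.0


def classify(score):
    """Map a product score to a category; boundaries between reference points are assumed."""
    if score >= 70:
        return "exact"
    if score >= 40:
        return "exact within 1% to 2% error"
    if score >= 15:
        return "homologous"
    return "possibly related"


if __name__ == "__main__":
    score = product_score(80, 50)
    print(score, classify(score))
```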

The results of northern analysis are reported as a list of libraries in which the transcript 
encoding SIGP occurs. Abundance and percent abundance are also reported. Abundance 
directly reflects the number of times a particular transcript is represented in a cDNA library, 
and percent abundance is abundance divided by the total number of sequences examined in 
the cDNA library. 
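
The arithmetic may be illustrated as follows. The counts are hypothetical, and multiplying the ratio by 100 to express it as a percentage is an assumption of the sketch.

```python
# Illustrative sketch: percent abundance of a transcript in a cDNA library.
def percent_abundance(abundance, total_sequences):
    """Abundance divided by the total sequences examined, expressed as a percentage."""
    if total_sequences <= 0:
        raise ValueError("total_sequences must be positive")
    return 100.0 * abundance / total_sequences


if __name__ == "__main__":
    # Hypothetical library: transcript seen 3 times among 1200 sequences.
    print(percent_abundance(3, 1200))
```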

V. Extension of SIGP Encoding Polynucleotides 

The nucleic acid sequence of one of the polynucleotides of the present invention was 
used to design oligonucleotide primers for extending a partial nucleotide sequence to full 
length. One primer was synthesized to initiate extension of an antisense polynucleotide, and 
the other was synthesized to initiate extension of a sense polynucleotide. Primers were used 
to facilitate the extension of the known sequence "outward" generating amplicons containing 
new unknown nucleotide sequence for the region of interest. The initial primers were 
designed from the cDNA using OLIGO 4.06 (National Biosciences, Plymouth, MN), or 
another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content 
of about 50% or more, and to anneal to the target sequence at temperatures of about 68°C to 
about 72°C. Any stretch of nucleotides which would result in hairpin structures and 
primer-primer dimerizations was avoided. 
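
The design criteria above can be sketched as a screening function, offered for illustration only. This is not the OLIGO 4.06 algorithm: the melting temperature uses a simple GC-based approximation, Tm = 64.9 + 41 x (G + C - 16.4) / N, chosen here as an assumption, the hairpin screen is a crude search for a short self-complementary stem, and primer-primer dimer screening is omitted. All function names are hypothetical.

```python
# Illustrative sketch: screen a candidate primer against the stated design criteria.
_COMP = str.maketrans("ACGT", "TGCA")


def gc_percent(primer):
    """Percentage of G and C bases."""
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def approx_tm(primer):
    """Approximate Tm in deg C: 64.9 + 41 * (G + C - 16.4) / N (assumed formula)."""
    gc = primer.count("G") + primer.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(primer)


def has_hairpin_risk(primer, stem=4, min_loop=3):
    """True if a `stem`-base segment has its reverse complement downstream of a loop."""
    for i in range(len(primer) - 2 * stem - min_loop + 1):
        rev_comp = primer[i:i + stem].translate(_COMP)[::-1]
        if primer.find(rev_comp, i + stem + min_loop) != -1:
            return True
    return False


def primer_ok(primer, tm_low=68.0, tm_high=72.0):
    """Apply length (22 to 30 nt), GC (50% or more), Tm, and hairpin criteria."""
    primer = primer.upper()
    return (22 <= len(primer) <= 30
            and gc_percent(primer) >= 50.0
            and tm_low <= approx_tm(primer) <= tm_high
            and not has_hairpin_risk(primer))


if __name__ == "__main__":
    print(has_hairpin_risk("GGGGAAACCCC"))  # stem GGGG pairs with downstream CCCC
```

A nearest-neighbor thermodynamic model would give more reliable annealing temperatures than the approximation assumed here.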

Selected human cDNA libraries (GIBCO/BRL) were used to extend the sequence. If 
more than one extension is necessary or desired, additional sets of primers are designed to 
further extend the known region. 

High fidelity amplification was obtained by following the instructions for the XL-PCR 
kit (Perkin Elmer) and thoroughly mixing the enzyme and reaction mix. PCR was performed 
using the Peltier Thermal Cycler (PTC200; M.J. Research, Watertown, MA), beginning with 
40 pmol of each primer and the recommended concentrations of all other components of the 
kit, with the following parameters: 



Step 1     94°C for 1 min (initial denaturation) 
Step 2     65°C for 1 min 
Step 3     68°C for 6 min 
Step 4     94°C for 15 sec 
Step 5     65°C for 1 min 
Step 6     68°C for 7 min 
Step 7     Repeat steps 4 through 6 for an additional 15 cycles 
Step 8     94°C for 15 sec 
Step 9     65°C for 1 min 
Step 10    68°C for 7:15 min 
Step 11    Repeat steps 8 through 10 for an additional 12 cycles 
Step 12    72°C for 8 min 
Step 13    4°C (and holding) 
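
For illustration, the cycling parameters above can be written as a small data structure from which the total programmed time follows. The representation and names are assumptions of the sketch; ramp times between temperatures and the final 4°C hold are not counted.

```python
# Illustrative sketch: the extension PCR program as (repeat count, [(deg C, seconds), ...]).
EXTENSION_PROGRAM = [
    (1, [(94, 60)]),                         # Step 1: initial denaturation
    (1, [(65, 60), (68, 360)]),              # Steps 2-3
    (16, [(94, 15), (65, 60), (68, 420)]),   # Steps 4-6: once plus 15 additional cycles
    (13, [(94, 15), (65, 60), (68, 435)]),   # Steps 8-10: once plus 12 additional cycles
    (1, [(72, 480)]),                        # Step 12: final extension
]


def programmed_seconds(program):
    """Total time at temperature, excluding ramping and the final hold."""
    return sum(repeats * sum(seconds for _, seconds in steps)
               for repeats, steps in program)


if __name__ == "__main__":
    total = programmed_seconds(EXTENSION_PROGRAM)
    print(total, "seconds, i.e. about", round(total / 3600, 2), "hours")
```

Under these assumptions the program runs for 15,510 seconds at temperature, or roughly 4 hours 18 minutes.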



A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by electrophoresis on a 
low concentration (about 0.6% to 0.8%) agarose mini-gel to determine which reactions were 
successful in extending the sequence. Bands thought to contain the largest products were 
excised from the gel, purified using QIAQuick™ (QIAGEN Inc., Chatsworth, CA), and 
trimmed of overhangs using Klenow enzyme to facilitate religation and cloning. 

After ethanol precipitation, the products were redissolved in 13 µl of ligation buffer, 
1 µl T4-DNA ligase (15 units) and 1 µl T4 polynucleotide kinase were added, and the mixture 
was incubated at room temperature for 2 to 3 hours, or overnight at 16°C. Competent E. coli 
cells (in 40 µl of appropriate media) were transformed with 3 µl of ligation mixture and 
cultured in 80 µl of SOC medium. (See, e.g., Sambrook, supra, Appendix A, p. 2.) After 
incubation for one hour at 37°C, the E. coli mixture was plated on Luria Bertani (LB) agar 
(see, e.g., Sambrook, supra, Appendix A, p. 1) containing 2x Carb. The following day, 
several colonies were randomly picked from each plate and cultured in 150 µl of liquid 
LB/2x Carb medium placed in an individual well of an appropriate commercially-available 
sterile 96-well microtiter plate. The following day, 5 µl of each overnight culture was 
transferred into a non-sterile 96-well plate and, after dilution 1:10 with water, 5 µl from each 
sample was transferred into a PCR array. 

For PCR amplification, 18 µl of concentrated PCR reaction mix (3.3x) containing 4 
units of rTth DNA polymerase, a vector primer, and one or both of the gene specific primers 
used for the extension reaction were added to each well. Amplification was performed using 
the following conditions: 

Step 1     94°C for 60 sec 
Step 2     94°C for 20 sec 
Step 3     55°C for 30 sec 
Step 4     72°C for 90 sec 
Step 5     Repeat steps 2 through 4 for an additional 29 cycles 
Step 6     72°C for 180 sec 
Step 7     4°C (and holding) 

Aliquots of the PCR reactions were run on agarose gels together with molecular 
weight markers. The sizes of the PCR products were compared to the original partial cDNAs, 
and appropriate clones were selected, ligated into plasmid, and sequenced. 

In like manner, one of the nucleotide sequences of the present invention was used to 
obtain 5' regulatory sequences using the procedure above, oligonucleotides designed for 
5' extension, and an appropriate genomic library. 

VI. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from one of the nucleotide sequences of the present 
invention are employed to screen cDNAs, genomic DNAs, or mRNAs. Although the labeling 
of oligonucleotides, consisting of about 20 base pairs, is specifically described, essentially the 
same procedure is used with larger nucleotide fragments. Oligonucleotides are designed 
using state-of-the-art software such as OLIGO 4.06 (National Biosciences) and labeled by 
combining 50 pmol of each oligomer, 250 µCi of [γ-32P] adenosine triphosphate (Amersham, 
Chicago, IL), and T4 polynucleotide kinase (DuPont NEN®, Boston, MA). The labeled 
oligonucleotides are substantially purified using a Sephadex G-25 superfine resin column 
(Pharmacia & Upjohn, Kalamazoo, MI). An aliquot containing 10^7 counts per minute of the 
labeled probe is used in a typical membrane-based hybridization analysis of human genomic 
DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or 
Pvu II (DuPont NEN, Boston, MA). 

The DNA from each digest is fractionated on a 0.7 percent agarose gel and 
transferred to nylon membranes (Nytran Plus, Schleicher & Schuell, Durham, NH). 
Hybridization is carried out for 16 hours at 40°C. To remove nonspecific signals, blots are 
sequentially washed at room temperature under increasingly stringent conditions up to 0.1x 
saline sodium citrate and 0.5% sodium dodecyl sulfate. After XOMAT AR™ film (Kodak, 
Rochester, NY) is exposed to the blots for several hours, hybridization patterns are 
compared visually. 

VII. Microarrays 

To produce oligonucleotides for a microarray, one of the nucleotide sequences of the 
present invention is examined using a computer algorithm which starts at the 3' end of the 
nucleotide sequence. For each sequence, the algorithm identifies oligomers of defined length 
that are unique to the nucleic acid sequence, have a GC content within a range suitable for 
hybridization, and lack secondary structure that would interfere with hybridization. The 
algorithm identifies approximately 20 oligonucleotides corresponding to each nucleic acid 
sequence. For each sequence-specific oligonucleotide, a pair of oligonucleotides is 
synthesized in which the first oligonucleotide differs from the second oligonucleotide by one 
nucleotide in the center of the sequence. The oligonucleotide pairs can be arranged on a 
substrate, e.g., a silicon chip, using a light-directed chemical process. (See, e.g., Chee, supra.) 

In the alternative, a chemical coupling procedure and an ink jet device can be used to 
synthesize oligomers on the surface of a substrate. (See, e.g., Baldeschweiler, supra.) An 
array analogous to a dot or slot blot may also be used to arrange and link fragments or 
oligonucleotides to the surface of a substrate using thermal, UV, mechanical, or chemical 
bonding procedures, or a vacuum system. A typical array may be produced by hand or using 
available methods and machines and contain any appropriate number of elements. After 
hybridization, nonhybridized probes are removed and a scanner is used to determine the levels 
and patterns of fluorescence. The degree of complementarity and the relative abundance of 
each oligonucleotide sequence on the microarray may be assessed through analysis of the 
scanned images. 

VIII. Complementary Polynucleotides 

Sequences complementary to the SIGP-encoding sequences, or any parts thereof, are 
used to detect, decrease, or inhibit expression of naturally occurring SIGP. Although use of 
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same 
procedure is used with smaller or with larger sequence fragments. Appropriate 
oligonucleotides are designed using Oligo 4.06 software and the coding sequence of SIGP. 
To inhibit transcription, a complementary oligonucleotide is designed from the most unique 
5' sequence and used to prevent promoter binding to the coding sequence. To inhibit 
translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the 
SIGP-encoding transcript. 
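
By way of example only, a complementary (antisense) oligonucleotide for a chosen stretch of a coding sequence is its reverse complement, which may be derived as sketched below. The function name, the default length of 20, and the input sequence are hypothetical; the sketch does not reproduce the Oligo 4.06 selection of the target region.

```python
# Illustrative sketch: derive the DNA antisense oligonucleotide for a target region.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(coding, start, length=20):
    """Return the reverse complement of coding[start:start + length], written 5' to 3'."""
    target = coding.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target region extends past the end of the sequence")
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical coding fragment, not taken from the Sequence Listing.
    print(antisense_oligo("ATGGCCGCTACC", 0, length=6))
```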



IX. Expression of SIGP 

Expression of SIGP is accomplished by subcloning the cDNA into an appropriate 
vector and transforming the vector into host cells. This vector contains an appropriate 
promoter, e.g., β-galactosidase upstream of the cloning site, operably associated with the 
cDNA of interest. (See, e.g., Sambrook, supra, pp. 404-433; and Rosenberg, M. et al. (1983) 
Methods Enzymol. 101:123-138.) 

Induction of an isolated, transformed bacterial strain with isopropyl 
beta-D-thiogalactopyranoside (IPTG) using standard methods produces a fusion protein which 
consists of the first 8 residues of β-galactosidase, about 5 to 15 residues of linker, and the full 
length protein. The signal residues direct the secretion of SIGP into bacterial growth media 
which can be used directly in the following assay for activity. 

X. Production of SIGP Specific Antibodies 

SIGP substantially purified using PAGE electrophoresis (see, e.g., Harrington, M.G. 
(1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize rabbits and to produce antibodies using standard protocols. The SIGP amino acid 
sequence is analyzed using DNASTAR software (DNASTAR Inc) to determine regions of 
high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise 
antibodies by means known to those of skill in the art. Methods for selection of appropriate 
epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the 
art. (See, e.g., Ausubel et al. supra, ch. 11.) 
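
One simple way to locate a hydrophilic region is to slide a fixed-width window along the sequence and average a per-residue hydropathy value, as sketched below for illustration. This is not the DNASTAR method: the use of the Kyte-Doolittle hydropathy scale, the 15-residue window, the one-letter input, and the function name are all assumptions of the sketch.

```python
# Illustrative sketch: find the most hydrophilic window using Kyte-Doolittle hydropathy
# values (lower mean = more hydrophilic). Scale choice and window width are assumptions.
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def most_hydrophilic_window(protein, width=15):
    """Return (start, peptide, mean hydropathy) for the lowest-scoring window."""
    protein = protein.upper()
    if len(protein) < width:
        raise ValueError("sequence shorter than window")

    def window_sum(i):
        return sum(KD[aa] for aa in protein[i:i + width])

    start = min(range(len(protein) - width + 1), key=window_sum)
    peptide = protein[start:start + width]
    return start, peptide, window_sum(start) / width


if __name__ == "__main__":
    # Hypothetical sequence with a charged stretch flanked by leucines.
    print(most_hydrophilic_window("LLLLLLLLLLKKKKKDDDDDEEEEELLLLLLLLLL"))
```

Hydrophilicity alone is only a rough proxy for immunogenicity; surface accessibility and proximity to the C-terminus, as noted above, would also bear on the choice.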

Typically, the oligopeptides are 15 residues in length, and are synthesized using an 
Applied Biosystems Peptide Synthesizer Model 431A using fmoc-chemistry and coupled to 
KLH (Sigma, St. Louis, MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide 
ester (MBS) to increase immunogenicity. (See, e.g., Ausubel et al. supra.) Rabbits are 
immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. Resulting 
antisera are tested for antipeptide activity, for example, by binding the peptide to plastic, 
blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with 
radio-iodinated goat anti-rabbit IgG. 

XI. Purification of Naturally Occurring SIGP Using Specific Antibodies 

Naturally occurring or recombinant SIGP is substantially purified by immunoaffinity 
chromatography using antibodies specific for SIGP. An immunoaffinity column is 
constructed by covalently coupling anti-SIGP antibody to an activated chromatographic resin, 
such as CNBr-activated Sepharose (Pharmacia & Upjohn). After the coupling, the resin is 
blocked and washed according to the manufacturer's instructions. 

Media containing SIGP are passed over the immunoaffinity column, and the column 
is washed under conditions that allow the preferential absorbance of SIGP (e.g., high ionic 
strength buffers in the presence of detergent). The column is eluted under conditions that 
disrupt antibody/SIGP binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a 
chaotrope, such as urea or thiocyanate ion), and SIGP is collected. 

XII. Identification of Molecules Which Interact with SIGP 

SIGP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter 
reagent. (See, e.g., Bolton et al. (1973) Biochem. J. 133:529.) Candidate molecules 
previously arrayed in the wells of a multi-well plate are incubated with the labeled SIGP, 
washed, and any wells with labeled SIGP complex are assayed. Data obtained using different 
concentrations of SIGP are used to calculate values for the number, affinity, and association 
of SIGP with the candidate molecules. 

Various modifications and variations of the described methods and systems of the 
invention will be apparent to those skilled in the art without departing from the scope and 
spirit of the invention. Although the invention has been described in connection with specific 
preferred embodiments, it should be understood that the invention as claimed should not be 
unduly limited to such specific embodiments. Indeed, various modifications of the described 
modes for carrying out the invention which are obvious to those skilled in molecular biology 
or related fields are intended to be within the scope of the following claims. 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: TO BE ASSIGNED 

(B) FILING DATE: HEREWITH 

(C) CLASSIFICATION: 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: BILLINGS, LUCY J. 

(B) REGISTRATION NUMBER: 36,749 

(C) REFERENCE/DOCKET NUMBER: PF-0459 US 



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (650) 855-0555 

(B) TELEFAX: (650) 845-4166 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 
(A) LIBRARY: HEARNOT01 

(B) CLONE: 305841 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



Met Ala Ala Thr Leu Gly Pro Leu Gly Ser Trp Gln Gln Trp Arg
                  5                  10                  15


Arg Cys Leu Ser Ala Arg Asp Gly Ser Arg Met Leu Leu Leu Leu
                 20                  25                  30


Leu Leu Leu Gly Ser Gly Gln Gly Pro Gln Gln Val Gly Ala Gly
                 35                  40                  45


p "1 n 


Th r 




\J-L Li 


iyz 


Leu 


ys 


Arg 


Glu 


His 


OCJ. 


Leu 


Ser 


L 

ys 


IT 1 {J 










50 










55 










60 


iyr 


pi n 

OXI1 


uiy 


V CL X 


Prl V 

kjm y 


Tin T~ 
1 ill. 


\3 X Y 


Car 


Q tr 1 


O ti X 


Leu 


irp 


.0.0 11 


T 

.Leu 


i v it; l. 






























7 R 




Sen 


r-ll ct 


i v lfc; L 


Vd-L 




-L 111 


Gin 


i yr 


Tip 

lie 


7\ r rr 
J-t-L y 


Leu 


T"h v 
l 111 


r I U 


7\ on 

" tr 










80 










85 










90 




bXIl 


Oav- 

oex 


Lys 


fin 


P1 \7 


ai ^ 

iAX d. 


Leu 


irp 


7\ o -n 
i-ioil 


ral y 


V d. 1 


P -r r> 


^y o 


Jr I it; 




















100 










105 


T 


r\J_ y 


rio p 


irp 


Pin 


Leu 


bill 


V d 1 


Hie; 

nio 


Ph (=> 


Lys 


T 1 *a 
lie 


m o 


Pl T7 

\j-Ly 


oil 1 










x x u 










1 1 R 

11J 










1 90 

1 iL \J 


Pi u 


Lys 


Lys 


Asn 


Leu 


nlS 


Pi u 


i-iSp 


P 1 W 

bxy 


Leu 


Aid 


x xe 


irp 


iyr 


Thr 










125 










130 










135 


Lys 


Asp 


Arg 


1 V I6 L 


pi n 
bxn 


Pro 


Pi 17 

bxy 


Pro 


v ax 


irne 


PI w 

bxy 


Asn 


i v je l. 


Asp 


Lys 










i & o 

i y u 










I 4 S 

II J 










1 SO 


rne 


vai 


p 1 t t 


Leu 


bxy 


v ax 


irne 


Val 


Asp 


i nr 


Tyr 


Pro 


Asn 


m n 
bXU 


Pin 
bXU 










IDJ 










1 £0 
lOU 










1 

IDj 


Lys 


bin 


bin 


Pin 


Arg 


vai 


irne 


Pro 


Tyr 


x xe 


Ser 


Hid 


lyjen 


v ax 


Asn 










1 / U 










X / 0 










i on 

X O U 


Asn 


bxy 


Q d 

oS IT 


Leu 


Ocl 


iyr 


rib p 


nlS 


CI n 
bXU 


7A v-/-y 


Asp 


PI \r 

bxy 


Arg 


rXO 


i nr 










1 R R 
lo J 










1 QO 
X i? U 










1 

±z/D 


p ~\ n 
bXU 


Leu 


Pi ,r 


P 1 17 

bXy 


Cys 


Tnr 


Hid 


x xe 


1 


Arg 


Asn 


Leu 




Tyr 


Asp 




















9 n r 










91 0 
Z1U 


i nr 


xr ne 


Leu 


vai 


x xe 


Arg 


Tyr 


v ax 


Lys 


Arg 


rii S 


Leu 


Thr 


x xe 


iyier 










ilO 










zzu 














Asp 


Tip 
lie 


Asp 


pi v 

bry 


Lys 


il 1 o 


kjX u. 


irp 


rlX y 




by o 


Tl A 

11c 




V cl X 










230 










235 










24 0 


Pr O 




V ci _L 


Arg 


Leu 


Dy-A 

riu 


ri.X y 


P 1 TT 


iyr 


iyr 


rile 


biy 


TK r- 
1 111 


Oar 


Q o v 
OKI 










245 










250 










255 


Ile Thr Gly Asp Leu Ser Asp Asn His Asp Val Ile Ser Leu Lys
260 265 270

Leu Phe Glu Leu Thr Val Glu Arg Thr Pro Glu Glu Glu Lys Leu
275 280 285

His Arg Asp Val Phe Leu Pro Ser Val Asp Asn Met Lys Leu Pro
290 295 300

Glu Met Thr Ala Pro Leu Pro Pro Leu Ser Gly Leu Ala Leu Phe
305 310 315

Leu Ile Val Phe Phe Ser Leu Val Phe Ser Val Phe Ala Ile Val
320 325 330

Ile Gly Ile Ile Leu Tyr Asn Lys Trp Gln Glu Gln Ser Arg Lys
335 340 345

Arg Phe Tyr

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 194 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: EOSIHET02

(B) CLONE: 322866

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met 


Gly 


Met 


Ser 


Ser 


Leu 


Lys 


Leu 


Leu 


Lys 


T\7T 


Val 


Leu 


Phe 


Phe 








rr 

D 










i n 

J. u 










15 


Phe 


Asn 


Leu 


Leu 


Phe 


Trp 


_l j_e 


Cys 


Gly 






He 


Leu 


Gly 


Phe 










zu 


















30 


Gly 


He 


Tyr 


Leu 


Leu 


± le 


nlS 


7\ cr Tt 


Asn 


Phe 


Q]_y 


Val 


Leu 


Phe 


His 






35 










a n 










45 


Asn 


Leu 


Pro 


Ser 


Leu 


Thr 


Leu 


{j±y 


Asn 


1 

V d_L 


Phe 


Val 


He 


Val 


Gly 










50 










3 3 










60 


Ser 


He 


He 


Met 


Val 


val 


Ala 


tr ne 


Leu 


vj_Ly 


uyo 


Met 


Gly 


Ser 


He 










65 




















75 


Lys Glu Asn Lys Cys Leu Leu Met Ser Phe Phe Ile Leu Leu Leu
80 85 90

Ile Ile Leu Leu Ala Glu Val Thr Leu Ala Ile Leu Leu Phe Val
95 100 105

Tyr Glu Gln Lys Leu Asn Glu Tyr Val Ala Lys Gly Leu Thr Asp
110 115 120

Ser Ile His Arg Tyr His Ser Asp Asn Ser Thr Lys Ala Ala Trp
125 130 135

Asp Ser Ile Gln Ser Phe Leu Gln Cys Cys Gly Ile Asn Gly Thr
140 145 150

Ser Asp Leu Asp Ser Gly Ser Pro Ala Ser Cys Pro Ser Asp Arg
155 160 165

Lys Val Glu Gly Cys Tyr Ala Lys Glu Asp Phe Gly Phe Ile Gln
170 175 180

Phe Pro Val Tyr Arg Asn His His His Leu Cys Met Cys Asp
185 190

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 342 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BEPINOT01 

(B) CLONE: 546656 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Ser Leu His Gly Lys Arg Lys Glu Ile Tyr Lys Tyr Glu Ala
5 10 15

Pro Trp Thr Val Tyr Ala Met Asn Trp Ser Val Arg Pro Asp Lys
20 25 30

Arg Phe Arg Leu Ala Leu Gly Ser Phe Val Glu Glu Tyr Asn Asn
35 40 45

Lys Val Gln Leu Val Gly Leu Asp Glu Glu Ser Ser Glu Phe Ile
50 55 60

Cys Arg Asn Thr Phe Asp His Pro Tyr Pro Thr Thr Lys Leu Met
65 70 75




He 


Pro 


Asp 


Thr 


Lys 


Gly 


Val 


Tvr 


Pro 


Asp 


Leu 


Leu 


Ala 


Thr 










80 










85 










90 


Ser Gly Asp Tyr Leu Arg Val Trp Arg Val Gly Glu Thr Glu Thr
95 100 105

Arg Leu Glu Cys Leu Leu Asn Asn Asn Lys Asn Ser Asp Phe Cys
110 115 120

Ala Pro Leu Thr Ser Phe Asp Trp Asn Glu Val Asp Pro Tyr Leu
125 130 135

Leu Gly Thr Ser Ser Ile Asp Thr Thr Cys Thr Ile Trp Gly Leu
140 145 150




1 I1X. 


uiy 


Gin 


Val 


Leu 


Gly 


Arg 


Val 


Asn 


Leu 


Val 


Ser 


Gly 


His 










i j j 










160 










165 


Val Lys Thr Gln Leu Ile Ala His Asp Lys Glu Val Tyr Asp Ile
170 175 180

Ala Phe Ser Arg Ala Gly Gly Gly Arg Asp Met Phe Ala Ser Val
185 190 195

Gly Ala Asp Gly Ser Val Arg Met Phe Asp Leu Arg His Leu Glu
200 205 210


His 




± IIjL 


He 


lie 


lyr 


ul U 


rio p 






n j- o 


n_L o 


i: J- \J 


Leu 


Leu 




















220 










225 


7\ -y~rf 
r\±- y 


Leu 


oy o 


i rp 


Aon 


Lys 


m n 

ulU 


rib j-J 


Pro 


Asn 


i 


Leu 


Ala 


Thr 


Met 










9 "^0 

4i -J> U 










9 ^ R 
z. o .j 










94 0 

/L H \J 


nld 




rio p 


uiy 






Val 


Val 


He 




Asp 


Val 


Arg 


Val 


Pro 










OA R 










9^0 










9 S S 
^ -j *j 


<^ys 


1 Hr 


"D >- r-\ 

irro 


v ax 




A -K- rr 

niy 


Leu 






ill Jb 


7\ r rr 


ai a 

i-iJ. cl 


fire 

oy o 


V d J_ 












9 ^ n 










9 £S 
z U J 










97 n 




He 


Ala 


i rp 


Ala 


Pro 


His 


Ser 


Ser 


Cys 


His 


He 


Cys 


Thr 


Ala 






275 










280 










285 


Ala Asp Asp His Gln Ala Leu Ile Trp Asp Ile Gln Gln Met Pro
290 295 300

Arg Ala Ile Glu Asp Pro Ile Leu Ala Tyr Thr Ala Glu Gly Glu
305 310 315


lie 


Asn 


Asn 


Val 


Gin 




Ala 


Ser 


Thr 


Gin 


Pro 


Asp 




He 


Ala 










320 










325 










330 


He 


Cys 


Tyr 


Asn 


Asn 


Cys 


Leu 


Glu 


He 


Leu 


Arg 


Val 
















335 










340 












(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 656 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: SYNORAT03

(B) CLONE: 693453

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Glu Glu Leu Asp Gly Glu Pro Thr Val Thr Leu Ile Pro Gly
5 10 15

Val Asn Ser Lys Lys Asn Gln Met Tyr Phe Asp Trp Gly Pro Gly
20 25 30

Glu Met Leu Val Cys Glu Thr Ser Phe Asn Lys Lys Glu Lys Ser
35 40 45

Glu Met Val Pro Ser Cys Pro Phe Ile Tyr Ile Ile Arg Lys Asp
50 55 60

Val Asp Val Tyr Ser Gln Ile Leu Arg Lys Leu Phe Asn Glu Ser
65 70 75

His Gly Ile Phe Leu Gly Leu Gln Arg Ile Asp Glu Glu Leu Thr
80 85 90

Gly Lys Ser Arg Lys Ser Gln Leu Val Arg Val Ser Lys Asn Tyr
95 100 105

Arg Ser Val Ile Arg Ala Cys Met Glu Glu Met His Gln Val Ala
110 115 120

Ile Ala Ala Lys Asp Pro Ala Asn Gly Arg Gln Phe Ser Ser Gln
125 130 135

Val Ser Ile Leu Ser Ala Met Glu Leu Ile Trp Asn Leu Cys Glu
140 145 150

Ile Leu Phe Ile Glu Val Ala Pro Ala Gly Pro Leu Leu Leu His
155 160 165

Leu Leu Asp Trp Val Arg Leu His Val Cys Glu Val Asp Ser Leu
170 175 180

Ser Ala Asp Val Leu Gly Ser Glu Asn Pro Ser Lys His Asp Ser
185 190 195

Phe Trp Asn Leu Val Thr Ile Leu Val Leu Gln Gly Arg Leu Asp
200 205 210

Glu Ala Arg Gln Met Leu Ser Lys Glu Ala Asp Ala Ser Pro Ala
215 220 225

Ser Ala Gly Ile Cys Arg Ile Met Gly Asp Leu Met Arg Thr Met
230 235 240

Pro Ile Leu Ser Pro Gly Asn Thr Gln Thr Leu Thr Glu Leu Glu
245 250 255

Leu Lys Trp Gln His Trp His Glu Glu Cys Glu Arg Tyr Leu Gln
260 265 270

Asp Ser Thr Phe Ala Thr Ser Pro His Leu Glu Ser Leu Leu Lys
275 280 285

Ile Met Leu Gly Asp Glu Ala Ala Leu Leu Glu Gln Lys Glu Leu
290 295 300

Leu Ser Asn Trp Tyr His Phe Leu Val Thr Arg Leu Leu Tyr Ser
305 310 315

Asn Pro Thr Val Lys Pro Ile Asp Leu His Tyr Tyr Ala Gln Ser
320 325 330

Ser Leu Asp Leu Phe Leu Gly Gly Glu Ser Ser Pro Glu Pro Leu
335 340 345

Asp Asn Ile Leu Leu Ala Ala Phe Glu Phe Asp Ile His Gln Val
350 355 360

Ile Lys Glu Cys Ser Ile Ala Leu Ser Asn Trp Trp Phe Val Ala
365 370 375

His Leu Thr Asp Leu Leu Asp His Cys Lys Leu Leu Gln Ser His
380 385 390

Asn Leu Tyr Phe Gly Ser Asn Met Arg Glu Phe Leu Leu Leu Glu
395 400 405

Tyr Ala Ser Gly Leu Phe Ala His Pro Ser Leu Trp Gln Leu Gly
410 415 420

Val Asp Tyr Phe Asp Tyr Cys Pro Glu Leu Gly Arg Val Ser Leu
425 430 435

Glu Leu His Ile Glu Arg Ile Pro Leu Asn Thr Glu Gln Lys Ala
440 445 450

Leu Lys Val Leu Arg Ile Cys Glu Gln Arg Gln Met Thr Glu Gln
455 460 465

Val Arg Ser Ile Cys Lys Ile Leu Ala Met Lys Ala Val Arg Asn
470 475 480

Asn Arg Leu Gly Ser Ala Leu Ser Trp Ser Ile Arg Ala Lys Asp
485 490 495

Ala Ala Phe Ala Thr Leu Val Ser Asp Arg Phe Leu Arg Asp Tyr
500 505 510

Cys Glu Arg Gly Cys Phe Ser Asp Leu Asp Leu Ile Asp Asn Leu
515 520 525

Gly Pro Ala Met Met Leu Ser Asp Arg Leu Thr Phe Leu Gly Lys
530 535 540

Tyr Arg Glu Phe His Arg Met Tyr Gly Glu Lys Arg Phe Ala Asp
545 550 555

Ala Ala Ser Leu Leu Leu Ser Leu Met Thr Ser Arg Ile Ala Pro
560 565 570

Arg Ser Phe Trp Met Thr Leu Leu Thr Asp Ala Leu Pro Leu Leu
575 580 585

Glu Gln Lys Gln Val Ile Phe Ser Ala Glu Gln Thr Tyr Glu Leu
590 595 600

Met Arg Cys Leu Glu Asp Leu Thr Ser Arg Arg Pro Val His Gly
605 610 615

Glu Ser Asp Thr Glu Gln Leu Gln Asp Asp Asp Ile Glu Thr Thr
620 625 630

Lys Val Glu Met Leu Arg Leu Ser Leu Ala Arg Asn Leu Ala Arg
635 640 645

Ala Ile Ile Arg Glu Gly Ser Leu Glu Gly Ser
650 655

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 236 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAITUT03 

(B) CLONE: 866885 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Ala Pro Asp Pro Trp Phe Ser Thr Tyr Asp Ser Thr Cys Gln
5 10 15

Ile Ala Gln Glu Ile Ala Glu Lys Ile Gln Gln Arg Asn Gln Tyr
20 25 30

Glu Arg Lys Gly Glu Lys Ala Pro Lys Leu Thr Val Thr Ile Arg
35 40 45

Ala Leu Leu Gln Asn Leu Lys Glu Lys Ile Ala Leu Leu Lys Asp
50 55 60

Leu Leu Leu Arg Ala Val Ser Thr His Gln Ile Thr Gln Leu Glu
65 70 75

Gly Asp Arg Arg Gln Asn Leu Leu Asp Asp Leu Val Thr Arg Glu
80 85 90

Arg Leu Leu Leu Ala Ser Phe Lys Asn Glu Gly Ala Glu Pro Asp
95 100 105

Leu Ile Arg Ser Ser Leu Met Ser Glu Glu Ala Lys Arg Gly Ala
110 115 120

Pro Asn Pro Trp Leu Phe Glu Glu Pro Glu Glu Thr Arg Gly Leu
125 130 135

Gly Phe Asp Glu Ile Arg Gln Gln Gln Gln Lys Ile Ile Gln Glu
140 145 150

Gln Asp Ala Gly Leu Asp Ala Leu Ser Ser Ile Ile Ser Arg Gln
155 160 165


Lys 


Gin 


Met 


Gly 


Gin 


Glu 


He 


Gly 


Asn 


Glu 


Leu 


Asp 


C 1 n 


bill 


7\ c? n 






170 










175 










180 


Glu 


He 


He 


Asp 


Asp 


Leu 


Ala 


Asn 


Leu 


Val 


GlU 


Asn 


i nr 


riSp 










185 










190 










195 


Lys Leu Arg Asn Glu Thr Arg Arg Val Asn Met Val Asp Arg Lys
200 205 210

Ser Ala Ser Cys Gly Met Ile Met Val Ile Leu Leu Leu Leu Val
215 220 225

Ala Ile Val Val Val Ala Val Trp Pro Thr Asn
230 235

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 195 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT03 

(B) CLONE: 1242271

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



Met Leu Leu Asp Thr Val Gln Lys Val Phe Gln Lys Met Leu Glu
5 10 15

Cys Ile Ala Arg Ser Phe Arg Lys Gln Pro Glu Glu Gly Leu Arg
20 25 30

Leu Leu Tyr Ser Val Gln Arg Pro Leu His Glu Phe Ile Thr Ala
35 40 45

Val Gln Ser Arg His Thr Asp Thr Pro Val His Arg Gly Val Leu
50 55 60

Ser Thr Leu Ile Ala Gly Pro Val Val Glu Ile Ser His Gln Leu
65 70 75

Arg Lys Val Ser Asp Val Glu Glu Leu Thr Pro Pro Glu His Leu
80 85 90

Ser Asp Leu Pro Pro Phe Ser Arg Cys Leu Ile Gly Ile Ile Ile
95 100 105

Lys Ser Ser Asn Val Val Arg Ser Phe Leu Asp Glu Leu Lys Ala
110 115 120

Cys Val Ala Ser Asn Asp Ile Glu Gly Ile Val Cys Leu Thr Ala
125 130 135

Ala Val His Ile Ile Leu Val Ile Asn Ala Gly Lys His Lys Ser
140 145 150

Ser Lys Val Arg Glu Val Ala Ala Thr Val His Arg Lys Leu Lys
155 160 165

Thr Phe Met Glu Ile Thr Leu Glu Glu Asp Ser Ile Glu Arg Phe
170 175 180

Leu Tyr Glu Ser Ser Ser Arg Thr Leu Gly Glu Leu Leu Asn Ser
185 190 195





PF-0459 US 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 608 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: LUNGFET03

(B) CLONE: 1255027

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Thr Lys Thr Asp Glu Thr Thr Leu Val Ala Ser Trp Glu Thr
5 10 15

Arg Glu Lys Thr Ala Lys Thr Thr Leu Phe Leu Pro Leu Glu Phe
20 25 30

Trp Ser Tyr Lys Ala Glu Val Pro His Leu Pro Glu Leu Ala Tyr
35 40 45

Ser Ala Arg Ser Lys Met Ala Glu Leu Asn Thr His Val Asn Val
50 55 60

Lys Glu Lys Ile Tyr Ala Val Arg Ser Val Val Pro Asn Lys Ser
65 70 75

Asn Asn Glu Ile Val Leu Val Leu Gln Gln Phe Asp Phe Asn Val
80 85 90

Asp Lys Ala Val Gln Ala Phe Val Asp Gly Ser Ala Ile Gln Val
95 100 105

Leu Lys Glu Trp Asn Met Thr Gly Lys Lys Lys Asn Asn Lys Arg
110 115 120

Lys Arg Ser Lys Ser Lys Gln His Gln Gly Asn Lys Asp Ala Lys
125 130 135

Asp Lys Val Glu Arg Pro Glu Ala Gly Pro Leu Gln Pro Gln Pro
140 145 150

Pro Gln Ile Gln Asn Gly Pro Met Asn Gly Cys Glu Lys Asp Ser
155 160 165

Ser Ser Thr Asp Ser Ala Asn Glu Lys Pro Ala Leu Ile Pro Arg
170 175 180

Glu Lys Lys Ile Ser Ile Leu Glu Glu Pro Ser Lys Ala Leu Arg
185 190 195

Gly Val Thr Glu Gly Asn Arg Leu Leu Gln Gln Lys Leu Ser Leu
200 205 210

Asp Gly Asn Pro Lys Pro Ile His Gly Thr Thr Glu Arg Ser Asp
215 220 225

Gly Leu Gln Trp Ser Ala Glu Gln Pro Cys Asn Pro Ser Lys Pro
230 235 240

Lys Ala Lys Thr Ser Pro Val Lys Ser Asn Thr Pro Ala Ala His
245 250 255

Leu Glu Ile Lys Pro Asp Glu Leu Ala Lys Lys Arg Gly Pro Asn
260 265 270

Ile Glu Lys Ser Val Lys Asp Leu Gln Arg Cys Thr Val Ser Leu
275 280 285

Thr Arg Tyr Arg Val Met Ile Lys Glu Glu Val Asp Ser Ser Val
290 295 300

Lys Lys Ile Lys Ala Ala Phe Ala Glu Leu His Asn Cys Ile Ile
305 310 315

Asp Lys Glu Val Ser Leu Met Ala Glu Met Asp Lys Val Lys Glu
320 325 330

Glu Ala Met Glu Ile Leu Thr Ala Arg Gln Lys Lys Ala Glu Glu
335 340 345

Leu Lys Arg Leu Thr Asp Leu Ala Ser Gln Met Ala Glu Met Gln
350 355 360

Leu Ala Glu Leu Arg Ala Glu Ile Lys His Phe Val Ser Glu Arg
365 370 375

Lys Tyr Asp Glu Glu Leu Gly Lys Ala Ala Arg Phe Ser Cys Asp
380 385 390

Ile Glu Gln Leu Lys Ala Gln Ile Met Leu Cys Gly Glu Ile Thr
395 400 405

His Pro Lys Asn Asn Tyr Ser Ser Arg Thr Pro Cys Ser Ser Leu
410 415 420

Leu Pro Leu Leu Asn Ala His Ala Ala Thr Ser Gly Lys Gln Ser
425 430 435

Asn Phe Ser Arg Lys Ser Ser Thr His Asn Lys Pro Ser Glu Gly
440 445 450

Lys Ala Ala Asn Pro Lys Met Val Ser Ser Leu Pro Ser Thr Ala
455 460 465

Asp Pro Ser His Gln Thr Met Pro Ala Asn Lys Gln Asn Gly Ser
470 475 480

Ser Asn Gln Arg Arg Arg Phe Asn Pro Gln Tyr His Asn Asn Arg
485 490 495

Leu Asn Gly Pro Ala Lys Ser Gln Gly Ser Gly Asn Glu Ala Glu
500 505 510

Pro Leu Gly Lys Gly Asn Ser Arg His Glu His Arg Arg Gln Pro
515 520 525

His Asn Gly Phe Arg Pro Lys Asn Lys Gly Gly Ala Lys Asn Gln
530 535 540

Glu Ala Ser Leu Gly Met Lys Thr Pro Glu Ala Pro Ala His Ser
545 550 555

Glu Lys Pro Arg Arg Arg Gln His Ala Ala Asp Thr Ser Glu Ala
560 565 570

Arg Pro Phe Arg Gly Ser Val Gly Arg Val Ser Gln Cys Asn Leu
575 580 585

Cys Pro Thr Arg Ile Glu Val Ser Thr Asp Ala Ala Val Leu Ser
590 595 600

Val Pro Ala Val Thr Leu Val Ala
605

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: TESTTUT02

(B) CLONE: 1273453

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Val Ile Ser Trp His Leu Ala Ser Asp Met Asp Cys Val Val
5 10 15

Thr Leu Thr Thr Asp Ala Ala Arg Arg Ile Tyr Asp Glu Thr Gln
20 25 30

Gly Arg Gln Gln Val Leu Pro Leu Asp Ser Ile Tyr Lys Lys Thr
35 40 45

Leu Pro Asp Trp Lys Arg Ser Leu Pro His Phe Arg Asn Gly Lys
50 55 60

Leu 


Tyr 


irne 


Lys 


Pro 


He 


Gly Asp 


Pro 


Vdl 


rile 


/ax a 


Arg 


Asp 


Leu 










OO 










"7 n 










H R 


Leu Thr Phe Pro Asp Asn Val Glu His Cys Glu Thr Val Phe Gly
80 85 90

Met Leu Leu Gly Asp Thr Ile Ile Leu Asp Asn Leu Asp Ala Ala
95 100 105

Asn His Tyr Arg Lys Glu Val Val Lys Ile Thr His Cys Pro Thr
110 115 120

Leu Leu Thr Arg Asp Gly Asp Arg Ile Arg Ser Asn Gly Lys Phe
125 130 135

Gly Gly Leu Gln Asn Lys Ala Pro Pro Met Asp Lys Leu Arg Gly
140 145 150


Met 


Val 


Phe 


Gly Ala 


Pro 


Val 


Pro 


Lys 


/IT „ 

Gin 


Cys 


Leu 


lie 


Leu 


Gly 










155 










i r n 
IDU 










ICR 

1 DO 


Glu Gln Ile Asp Leu Leu Gln Gln Tyr Arg Ser Ala Val Cys Lys
170 175 180

Leu Asp Ser Val Asn Lys Asp Leu Asn Ser Gln Leu Glu Tyr Leu
185 190 195

Arg Thr Pro Asp Met Arg Lys Lys Lys Gln Glu Leu Asp Glu His
200 205 210

Glu Lys Asn Leu Lys Leu Ile Glu Glu Lys Leu Gly Met Thr Pro
215 220 225

Ile Arg Lys Cys Asn Asp Ser Leu Arg His Ser Pro Lys Val Glu
230 235 240

Thr Thr Asp Cys Pro Val Pro Pro Lys Arg Met Arg Arg Glu Ala
245 250 255

Thr Arg Gln Asn Arg Ile Ile Thr Lys Thr Asp Val
260 265

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 285 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: TESTTUT02 

(B) CLONE: 1275261 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Val Met Arg Pro Leu Trp Ser Leu Leu Leu Trp Glu Ala Leu
5 10 15

Leu Pro Ile Thr Val Thr Gly Ala Gln Val Leu Ser Lys Val Gly
20 25 30

Gly Ser Val Leu Leu Val Ala Ala Arg Pro Pro Gly Phe Gln Val
35 40 45

Arg Glu Ala Ile Trp Arg Ser Leu Trp Pro Ser Glu Glu Leu Leu
50 55 60

Ala Thr Phe Phe Arg Gly Ser Leu Glu Thr Leu Tyr His Ser Arg
65 70 75

Phe Leu Gly Arg Ala Gln Leu His Ser Asn Leu Ser Leu Glu Leu
80 85 90

Gly Pro Leu Glu Ser Gly Asp Ser Gly Asn Phe Ser Val Leu Met
95 100 105

Val Asp Thr Arg Gly Gln Pro Trp Thr Gln Thr Leu Gln Leu Lys
110 115 120

Val Tyr Asp Ala Val Pro Arg Pro Val Val Gln Val Phe Ile Ala
125 130 135

Val Glu Arg Asp Ala Gln Pro Ser Lys Thr Cys Gln Val Phe Leu
140 145 150

Ser Cys Trp Ala Pro Asn Ile Ser Glu Ile Thr Tyr Ser Trp Arg
155 160 165

Arg Glu Thr Thr Met Asp Phe Gly Met Glu Pro His Ser Leu Phe
170 175 180

Thr Asp Gly Gln Val Leu Ser Ile Ser Leu Gly Pro Gly Asp Arg
185 190 195

Asp Val Ala Tyr Ser Cys Ile Val Ser Asn Pro Val Ser Trp Asp
200 205 210

Leu Ala Thr Val Thr Pro Trp Asp Ser Cys His His Glu Ala Ala
215 220 225

Pro Gly Lys Ala Ser Tyr Lys Asp Val Leu Leu Val Val Val Pro
230 235 240

Val Ser Leu Leu Leu Met Leu Val Thr Leu Phe Ser Ala Trp His
245 250 255

Trp Cys Pro Cys Ser Gly Lys Lys Lys Lys Asp Val His Ala Asp
260 265 270

Arg Val Gly Pro Glu Thr Glu Asn Pro Leu Val Gln Asp Leu Pro
275 280 285

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 76 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: COLNNOT16

(B) CLONE: 1281682

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Pro Phe Thr Arg Pro Leu Lys His Phe Val Ser Leu Leu His
5 10 15

Pro Ser Ala Ser Gln Val His Asn Ala Gly Gln His Gln Lys Leu
20 25 30

Lys Thr Leu Glu Lys Ala Cys Gly Leu Ala Leu Gly Glu Gly Arg
35 40 45

Glu Gln Asn Leu Cys Thr Ser Leu Phe Asn Leu Glu Ile Arg His
50 55 60

Pro Arg Asp Ala Ile Ile Phe Cys Val Ser Ile Val Val Pro Leu
65 70 75

Ser

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 147 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: BRSTNOT07

(B) CLONE: 1298305

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Thr Ala Ser Thr Gly His Leu Gly Leu Gly Trp Ser Ala Arg
5 10 15

Jr J_ O 


Cys 


Pro 


Cys 


Gly 


Thr 


Leu 


Gly Ser 


Cys 


Phe 


Leu 


Ser 


Leu 


rile 










20 








25 










30 


Ala Ala Leu Leu Trp Leu Ala Ala Ala Val Leu Gln Ala Cys Val
35 40 45

Gly His Ser Asp Glu Gly Cys Gly Ala Ser Gln Cys Arg Arg Ala
50 55 60

Ala Leu Gly Ile Val Pro Ser Pro Val Ser Val Leu Arg Thr Tyr
65 70 75

Pro Gly Leu His His Gln Asp Pro Val Phe Gly Phe Arg Arg Pro
80 85 90

Ser Met Gly Lys Thr Arg His Gln Pro Leu Gln Gln Trp Val Pro
95 100 105

Leu Ala Cys Gly His Gln Leu Gly Asp Pro Gly Ser Gly Pro Leu
110 115 120

Leu Ser Pro Val Ser Leu Cys Cys Gly Phe Trp Ala Val Met Ser
125 130 135

Pro Pro Leu Lys Asp Val Phe Thr Leu Thr Ser Gly
140 145

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 261 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT12 

(B) CLONE: 1360501 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Glu Leu Leu Gln Val Thr Ile Leu Phe Leu Leu Pro Ser Ile
5 10 15

Cys Ser Ser Asn Ser Thr Gly Val Leu Glu Ala Ala Asn Asn Ser
20 25 30

Leu Val Val Thr Thr Thr Lys Pro Ser Ile Thr Thr Pro Asn Thr
35 40 45

Glu Ser Leu Gln Lys Asn Val Val Thr Pro Thr Thr Gly Thr Thr
50 55 60

Pro Lys Gly Thr Ile Thr Asn Glu Leu Leu Lys Met Ser Leu Met
65 70 75

Ser Thr Ala Thr Phe Leu Thr Ser Lys Asp Glu Gly Leu Lys Ala
80 85 90

Thr Thr Thr Asp Val Arg Lys Asn Asp Ser Ile Ile Ser Asn Val
95 100 105

Thr Val Thr Ser Val Thr Leu Pro Asn Ala Val Ser Thr Leu Gln
110 115 120

Ser Ser Lys Pro Lys Thr Glu Thr Gln Ser Ser Ile Lys Thr Thr
125 130 135

Glu Ile Pro Gly Ser Val Leu Gln Pro Asp Ala Ser Pro Ser Lys
140 145 150

Thr Gly Thr Leu Thr Ser Ile Pro Val Thr Ile Pro Glu Asn Thr
155 160 165

Ser Gln Ser Gln Val Ile Gly Thr Glu Gly Gly Lys Asn Ala Ser
170 175 180

Thr Ser Ala Thr Ser Arg Ser Tyr Ser Ser Ile Ile Leu Pro Val
185 190 195

Val Ile Ala Leu Ile Val Ile Thr Leu Ser Val Phe Val Leu Val
200 205 210

Gly Leu Tyr Arg Met Cys Trp Lys Ala Asp Pro Gly Thr Pro Glu
215 220 225

Asn Gly Asn Asp Gln Pro Gln Ser Asp Lys Glu Ser Val Lys Leu
230 235 240

Leu Thr Val Lys Thr Ile Ser His Glu Ser Gly Glu His Ser Ala
245 250 255

Gln Gly Lys Thr Lys Asn
260

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT12 

(B) CLONE: 1362406 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Ala Gly Cys Pro Ala Asp Arg Ser Ile Leu Ala Pro Leu Ala
5 10 15

Trp Asp Leu Gly Leu Leu Leu Leu Phe Val Gly Gln His Ser Leu
20 25 30

Met Ala Ala Glu Arg Val Lys Ala Trp Thr Ser Arg Tyr Phe Gly
35 40 45

Val Leu Gln Arg Ser Leu Tyr Val Ala Cys Thr Ala Leu Ala Leu
50 55 60

Gln Leu Val Met Arg Tyr Trp Glu Pro Ile Pro Lys Gly Pro Val
65 70 75

Leu Trp Glu Ala Arg Ala Glu Pro Trp Ala Thr Trp Val Pro Leu
80 85 90

Leu Cys Phe Val Leu His Val Ile Ser Trp Leu Leu Ile Phe Ser
95 100 105

Ile Leu Leu Val Phe Asp Tyr Ala Glu Leu Met Gly Leu Lys Gln
110 115 120

Val Tyr Tyr His Val Leu Gly Leu Gly Glu Pro Leu Ala Leu Lys
125 130 135


Ser 


Pro 


Arg 


Ala 


Leu 


Arg 


Leu 


Phe 


Ser 


His 


Leu 


Arg 


His 


Pro 


Val 



130 
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14 0 








Cys Val Glu Leu Leu Thr Val Leu Trp Val Val Pro Thr Leu Gly
155 160 165

Thr Asp Arg Leu Leu Leu Ala Phe Leu Leu Thr Leu Tyr Leu Gly
170 175 180

Leu Ala His Gly Leu Asp Gln Gln Asp Leu Arg Tyr Leu Arg Ala
185 190 195

Gln Leu Gln Arg Lys Leu His Leu Leu Ser Arg Pro Gln Asp Gly
200 205 210

Glu Ala Glu



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LATRTUT02 

(B) CLONE: 1405329

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 : 

Met Gln Pro Arg Pro Arg Gly Arg Pro Pro Arg Thr Arg Gly Asp
5 10 15

Glu Ala Pro Gln Trp His Leu Pro Asp Ala Ala Ala Leu Leu Pro
20 25 30

Val Arg Leu Pro Leu Ala Val Leu Val Arg Gly Thr Gln Arg Pro
35 40 45

Glu Arg Arg Arg Cys Gly Arg Leu Pro Ala Gly Val Pro Gly Ala
50 55 60

Ala Arg Ser Val Ala Arg Ser
65



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 161 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAINOT12 

(B) CLONE: 1415223

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 : 

Met Leu Ala Pro Gln Arg Thr Arg Ala Pro Ser Pro Arg Ala Ala
5 10 15

Pro Arg Pro Thr Arg Ser Met Leu Pro Ala Ala Met Lys Gly Leu
20 25 30

Gly


Leu 


Ala 


Leu 


Leu 


Ala 


Val 


Leu 


Leu 


Cys 


Ser 


Ala 


Pro 


Ala 


His 
































\j -L y 






Cys 


Gin 


Asp 


Cys 


Thr 


Leu 


Thr 


Thr 


Asn 


Ser 


Ser 


His 










o u 










-J o 










D U 




Thy 

JL us. 




Lys 


Gin 


Cys 


Gin 


Pro 


Ser 


Asp 


Thr 


Val 


Cys 


Ala 


Ser 




















/ u 










/ 3 


Val 


Arg 


Tip, 
J_ J_ fc! 


1 Il-L 




t -L vj 


OCI 


C ^ y~ 
OcJ. 


C ^ y- 


A t rr 


.Lys 




Hi s 


Ser 


Val 




















O 3 












Asn 


Lys 




y 










Hop 


it lie 


V a. J_ 


Lys 


rlx. y 


ulb 


XT I Ifcr 










95 










100 










105 


Phe Ser Asp Tyr Leu Met Gly Phe Ile Asn Ser Gly Ile Leu Lys
110 115 120

Val Asp Val Asp Cys Cys Glu Lys Asp Leu Cys Asn Gly Ala Ala
125 130 135

Gly Ala Gly His Ser Pro Trp Ala Leu Ala Gly Gly Leu Leu Leu
140 145 150

Ser Leu Gly Pro Ala Leu Leu Trp Ala Gly Pro
155 160












(2) INFORMATION FOR SEQ ID NO: 16:













(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 141 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAINOT12 

(B) CLONE: 1416553 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 : 



Met Trp Ala Gln Arg Val Leu Thr Leu Trp Gln Gly Leu Ser Trp
5 10 15

Gly Arg Pro Pro Ser Gly Pro Gly Ala Met Ala Pro Arg Gly Gln
20 25 30

Ala Asp Leu Leu Pro Ala Val Ser Thr Pro Phe Leu Ile Thr Val
35 40 45

Trp Ser Pro Ser Phe Gly Cys Ser Leu Arg Cys Val Leu Gly Ser
50 55 60

Ser Glu Pro Glu Ala Ser Phe Trp Lys Pro Ala Val Leu Pro Ala
65 70 75

Pro Val Gln Lys Pro Leu Ser Pro Ala Phe Pro Gln Ala Gly Val
80 85 90

Gly Val Gly Gly Leu Cys Pro Ser Ser Leu Thr Leu Glu Arg Trp
95 100 105

Glu Ala Gly Asn Leu His Leu Gly Ala Trp Ala Pro Pro Leu Cys
110 115 120

Ala Ser Gly Phe Pro Ala Pro Gly Arg Gly Cys Ser Pro Ser Trp
125 130 135

Thr Pro Ala Cys Pro Ser
140



















(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 152 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: KIDNNOT09

(B) CLONE: 1418517 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 : 



Met Glu Asp Glu Glu Val Ala Glu Ser Trp Glu Glu Ala Ala Asp
5 10 15

Ser Gly Glu Ile Asp Arg Arg Leu Glu Lys Lys Leu Lys Ile Thr
20 25 30

Gln Lys Glu Ser Arg Lys Ser Lys Ser Pro Pro Lys Val Pro Ile
35 40 45

Val Ile Gln Asp Asp Ser Leu Pro Ala Gly Pro Pro Pro Gln Ile
50 55 60

Arg Ile Leu Lys Arg Pro Thr Ser Asn Gly Val Val Ser Ser Pro
65 70 75

Asn Ser Thr Ser Arg Pro Thr Leu Pro Val Lys Ser Leu Ala Gln
80 85 90

Arg Glu Ala Glu Tyr Ala Glu Ala Arg Lys Arg Ile Leu Gly Ser
95 100 105

Ala Ser Pro Glu Glu Glu Gln Glu Lys Pro Ile Leu Asp Arg Pro
110 115 120

Thr Arg Ile Ser Gln Pro Glu Asp Ser Arg Gln Pro Asn Asn Val
125 130 135

Ile Arg Gln Pro Leu Gly Pro Asp Gly Ser Gln Gly Phe Lys Gln
140 145 150

Arg Arg




























(2) INFORMATION FOR SEQ ID NO: 18:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 742 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PANCNOT08 

(B) CLONE: 1438165 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 : 



Met Ala Ser Val His Glu Ser Leu Tyr Phe Asn Pro Met Met Thr
5 10 15

Asn Gly Val Val His Ala Asn Val Phe Gly Ile Lys Asp Trp Val
20 25 30

Thr Pro Tyr Lys Ile Ala Val Leu Val Leu Leu Asn Glu Met Ser
35 40 45

Arg Thr Gly Glu Gly Ala Val Ser Leu Met Glu Arg Arg Arg Leu
50 55 60

Asn Gln Leu Leu Leu Pro Leu Leu Gln Gly Pro Asp Ile Thr Leu
65 70 75

Ser Lys Leu Tyr Lys Leu Ile Glu Glu Ser Cys Pro Gln Leu Ala
80 85 90

Asn Ser Val Gln Ile Arg Ile Lys Leu Met Ala Glu Gly Glu Leu
95 100 105

Lys Asp Met Glu Gln Phe Phe Asp Asp Leu Ser Asp Ser Phe Ser
110 115 120

Gly Thr Glu Pro Glu Val His Lys Thr Ser Val Val Gly Leu Phe
125 130 135

Leu Arg His Met Ile Leu Ala Tyr Ser Lys Leu Ser Phe Ser Gln
140 145 150

Val Phe Lys Leu Tyr Thr Ala Leu Gln Gln Tyr Phe Gln Asn Gly
155 160 165

Glu Lys Lys Thr Val Glu Asp Ala Asp Met Glu Leu Thr Ser Arg
170 175 180

Asp Glu Gly Glu Arg Lys Met Glu Lys Glu Glu Leu Asp Val Ser
185 190 195

Val Arg Glu Glu Glu Val Ser Cys Ser Gly Pro Leu Ser Gln Lys
200 205 210

Gln Ala Glu Phe Phe Leu Ser Gln Gln Ala Ser Leu Leu Lys Asn
215 220 225

Asp Glu Thr Lys Ala Leu Thr Pro Ala Ser Leu Gln Lys Glu Leu
230 235 240

Asn Asn Leu Leu Lys Phe Asn Pro Asp Phe Ala Glu Ala His Tyr
245 250 255

Leu Ser Tyr Leu Asn Asn Leu Arg Val Gln Asp Val Phe Ser Ser
260 265 270

Thr His Ser Leu Leu His Tyr Phe Asp Arg Leu Ile Leu Thr Gly
275 280 285

Ala Glu Ser Lys Ser Asn Gly Glu Glu Gly Tyr Gly Arg Ser Leu
290 295 300

Arg Tyr Ala Ala Leu Asn Leu Ala Ala Leu His Cys Arg Phe Gly
305 310 315

His Tyr Gln Gln Ala Glu Leu Ala Leu Gln Glu Ala Ile Arg Ile
320 325 330

Ala Gln Glu Ser Asn Asp His Val Cys Leu Gln His Cys Leu Ser
335 340 345

Trp Leu Tyr Val Leu Gly Gln Lys Arg Ser Asp Ser Tyr Val Leu
350 355 360

Leu Glu His Ser Val Lys Lys Ala Val His Phe Gly Leu Pro Arg
365 370 375

Ala Phe Ala Gly Lys Thr Ala Asn Lys Leu Met Asp Ala Leu Lys
380 385 390

Asp Ser Asp Leu Leu His Trp Lys His Ser Leu Ser Glu Leu Ile
395 400 405

Asp Ile Ser Ile Ala Gln Lys Thr Ala Ile Trp Arg Leu Tyr Gly
410 415 420

Arg Ser Thr Met Ala Leu Gln Gln Ala Gln Met Leu Leu Ser Met
425 430 435

Asn Ser Leu Glu Ala Val Asn Ala Gly Val Gln Gln Asn Asn Thr
440 445 450

Glu Ser Phe Ala Val Ala Leu Cys His Leu Ala Glu Leu His Ala
455 460 465

Glu Gln Gly Cys Phe Ala Ala Ala Ser Glu Val Leu Lys His Leu
470 475 480

Lys Glu Arg Phe Pro Pro Asn Ser Gln His Ala Gln Leu Trp Met
485 490 495

Leu Cys Asp Gln Lys Ile Gln Phe Asp Arg Ala Met Asn Asp Gly
500 505 510

Lys Tyr His Leu Ala Asp Ser Leu Val Thr Gly Ile Thr Ala Leu
515 520 525

Asn Ser Ile Glu Gly Val Tyr Arg Lys Ala Val Val Leu Gln Ala
530 535 540

Gln Asn Gln Met Ser Glu Ala His Lys Leu Leu Gln Lys Leu Leu
545 550 555

Val His Cys Gln Lys Leu Lys Asn Thr Glu Met Val Ile Ser Val
560 565 570

Leu Leu Ser Val Ala Glu Leu Tyr Trp Arg Ser Ser Ser Pro Thr
575 580 585

Ile Ala Leu Pro Met Leu Leu Gln Ala Leu Ala Leu Ser Lys Glu
590 595 600

Tyr Arg Leu Gln Tyr Leu Ala Ser Glu Thr Val Leu Asn Leu Ala
605 610 615

Phe Ala Gln Leu Ile Leu Gly Ile Pro Glu Gln Ala Leu Ser Leu
620 625 630

Leu His Met Ala Ile Glu Pro Ile Leu Ala Asp Gly Ala Ile Leu
635 640 645

Asp Lys Gly Arg Ala Met Phe Leu Val Ala Lys Cys Gln Val Ala
650 655 660

Ser Ala Ala Ser Tyr Asp Gln Pro Lys Lys Ala Glu Ala Leu Glu
665 670 675

Ala Ala Ile Glu Asn Leu Asn Glu Ala Lys Asn Tyr Phe Ala Lys
680 685 690

Val Asp Cys Lys Glu Arg Ile Arg Asp Val Val Tyr Phe Gln Ala
695 700 705

Arg Leu Tyr His Thr Leu Gly Lys Thr Gln Glu Arg Asn Arg Cys
710 715 720

Ala Met Leu Phe Arg Gln Leu His Gln Glu Leu Pro Ser His Gly
725 730 735

Val Pro Leu Ile Asn His Leu
740






















(2) INFORMATION FOR SEQ ID NO: 19:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 805 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THYRNOT03 

(B) CLONE: 1440381 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 : 



Met Asp Gly Ile Leu Asp Glu Ser Leu Leu Glu Thr Cys Pro Ile
5 10 15

Gln Ser Pro Leu Gln Val Phe Ala Gly Met Gly Gly Leu Ala Leu
20 25 30

Ile Ala Glu Arg Leu Pro Met Leu Tyr Pro Glu Val Ile Gln Gln
35 40 45

Val Ser Ala Pro Val Val Thr Ser Thr Thr Gln Glu Lys Pro Tyr
50 55 60

Asp Ser Asp Gln Phe Glu Trp Val Thr Ile Glu Gln Ser Gly Glu
65 70 75

Leu Val Tyr Glu Ala Pro Glu Thr Val Ala Ala Glu Pro Pro Pro
80 85 90

Ile Lys Ser Ala Val Gln Thr Met Ser Pro Ile Pro Ala His Ser
95 100 105

Leu Ala Ala Phe Gly Leu Phe Leu Arg Leu Pro Gly Tyr Ala Glu
110 115 120

Val Leu Leu Lys Glu Arg Lys His Ala Gln Cys Leu Leu Arg Leu
125 130 135

Val Leu Gly Val Thr Asp Asp Gly Glu Gly Ser His Ile Leu Gln
140 145 150

Ser Pro Ser Ala Asn Val Leu Pro Thr Leu Pro Phe His Val Leu
155 160 165

Arg Ser Leu Phe Ser Thr Thr Pro Leu Thr Thr Asp Asp Gly Val
170 175 180

Leu Leu Arg Arg Met Ala Leu Glu Ile Gly Ala Leu His Leu Ile
185 190 195

Leu Val Cys Leu Ser Ala Leu Ser His His Ser Pro Arg Val Pro
200 205 210

Asn Ser Ser Val Asn Gln Thr Glu Pro Gln Val Ser Ser Ser His
215 220 225

Asn Pro Thr Ser Thr Glu Glu Gln Gln Leu Tyr Trp Ala Lys Gly
230 235 240

Thr Gly Phe Gly Thr Gly Ser Thr Ala Ser Gly Trp Asp Val Glu
245 250 255

Gln Ala Leu Thr Lys Gln Arg Leu Glu Glu Glu His Val Thr Cys
260 265 270

Leu Leu Gln Val Leu Ala Ser Tyr Ile Asn Pro Val Ser Ser Ala
275 280 285

Val Asn Gly Glu Ala Gln Ser Ser His Glu Thr Arg Gly Gln Asn
290 295 300

Ser Asn Ala Leu Pro Ser Val Leu Leu Glu Leu Leu Ser Gln Ser
305 310 315

Cys Leu Ile Pro Ala Met Ser Ser Tyr Leu Arg Asn Asp Ser Val
320 325 330

Leu Asp Met Ala Arg His Val Pro Leu Tyr Arg Ala Leu Leu Glu
335 340 345

Leu Leu Arg Ala Ile Ala Ser Cys Ala Ala Met Val Pro Leu Leu
350 355 360

Leu Pro Leu Ser Thr Glu Asn Gly Glu Glu Glu Glu Glu Gln Ser
365 370 375

Glu Cys Gln Thr Ser Val Gly Thr Leu Leu Ala Lys Met Lys Thr
380 385 390

Cys Val Asp Thr Tyr Thr Asn Arg Leu Arg Ser Lys Arg Glu Asn
395 400 405

Val Lys Thr Gly Val Lys Pro Asp Ala Ser Asp Gln Glu Pro Glu
410 415 420

Gly Leu Thr Leu Leu Val Pro Asp Ile Gln Lys Thr Ala Glu Ile
425 430 435

Val Tyr Ala Ala Thr Thr Ser Leu Arg Gln Ala Asn Gln Glu Lys
440 445 450

Asn Trp Val Asn Thr Pro Arg Arg Arg Leu Met Asn Pro Lys Pro
455 460 465

Leu Ser Val Leu Lys Ser Leu Glu Glu Lys Tyr Val Ala Val Met
470 475 480

Lys Lys Leu Gln Phe Asp Thr Phe Glu Met Val Ser Glu Asp Glu
485 490 495

Asp Gly Lys Leu Gly Phe Lys Val Asn Tyr His Tyr Met Ser Gln
500 505 510

Val Lys Asn Ala Asn Asp Ala Asn Ser Ala Ala Arg Ala Arg Arg
515 520 525

Leu Ala Gln Glu Ala Val Thr Leu Ser Thr Ser Leu Pro Leu Ser
530 535 540

Ser Ser Ser Ser Val Phe Val Arg Cys Asp Glu Glu Arg Leu Asp
545 550 555

Ile


Met 


Lys 


Val 


Leu 


He 


Thr 


Gly 


Pro 


Ala 


Asp 


Thr 


Pro 


Tyr 


Ha 

-MX cl 








560 










565 










3 / U 


Asn 


Gly 


Cys 


Phe 


Glu 


Phe 


Asp 


Val 


Tyr 


Phe 


Pro 


Gin 


Asp 


Tyr 


Pro 






575 










580 










0 0 D 


Ser 


Ser 


Pro 


Pro 


Leu 


Val 


Asn 


Leu 


Glu 


Thr 


Thr 


Gly 


Gly 


His 


Ser 






590 










595 










DUU 


Val 


Arg 


Phe 


Asn 


Pro 


Asn 


Leu 


Tyr 


Asn 


Asp 


Gly 


Lys 


Val 


bys 


Leu 








605 










<:i A 
OlU 










615 


Ser 


He 


Leu 


Asn 


Thr 


Trp 


His 


Gly Arg 


Pro 


Glu 


Glu 


Lys 


Trp 


Asn 








620 








625 










630 


Pro 


Gin 


Thr 


Ser 


Ser 


Phe 


Leu 


Gin 


Val 


Leu 


Val 


Ser 


vai 


bin 


Qqv- 






635 










64 U 










£ A S 


Leu Ile Leu Val Ala Glu Pro Tyr Phe Asn Glu Pro Gly Tyr Glu
650 655 660

Arg Ser Arg Gly Thr Pro Ser Gly Thr Gln Ser Ser Arg Glu Tyr
665 670 675

Asp Gly Asn Ile Arg Gln Ala Thr Val Lys Trp Ala Met Leu Glu
680 685 690

Gln Ile Arg Asn Pro Ser Pro Cys Phe Lys Glu Val Ile His Lys
695 700 705

His Phe Tyr Leu Lys Arg Val Glu Ile Met Ala Gln Cys Glu Glu
710 715 720

Trp Ile Ala Asp Ile Gln Gln Tyr Ser Ser Asp Lys Arg Val Gly
725 730 735

Arg Thr Met Ser His His Ala Ala Ala Leu Lys Arg His Thr Ala
740 745 750

Gln Leu Arg Glu Glu Leu Leu Lys Leu Pro Cys Pro Glu Gly Leu
755 760 765

Asp Pro Asp Thr Asp Asp Ala Pro Glu Val Cys Arg Ala Thr Thr
770 775 780

Gly Ala Glu Glu Thr Leu Met His Asp Gln Val Lys Pro Ser Ser
785 790 795

Ser Lys Glu Leu Pro Ser Asp Phe Gln Leu
800 805













(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 195 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT14

(B) CLONE: 1510839 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Lys Ala Ser Gln Cys Cys Cys Cys Leu Ser His Leu Leu Ala
5 10 15

Ser Val Leu Leu Leu Leu Leu Leu Pro Glu Leu Ser Gly Pro Leu
20 25 30

Ala Val Leu Leu Gln Ala Ala Glu Ala Ala Pro Gly Leu Gly Pro
35 40 45

Pro Asp Pro Arg Pro Arg Thr Leu Pro Pro Leu Pro Pro Gly Pro
50 55 60

Thr Pro Ala Gln Gln Pro Gly Arg Gly Leu Ala Glu Ala Ala Gly
65 70 75

Pro Arg Gly Ser Glu Gly Gly Asn Gly Ser Asn Pro Val Ala Gly
80 85 90

Leu Glu Thr Asp Asp His Gly Gly Lys Ala Gly Glu Gly Ser Val
95 100 105

Gly Gly Gly Leu Ala Val Ser Pro Asn Pro Gly Asp Lys Pro Met
110 115 120

Thr Gln Arg Ala Leu Thr Val Leu Met Val Val Ser Gly Ala Val
125 130 135

Leu Val Tyr Phe Val Val Arg Thr Val Arg Met Arg Arg Arg Asn
140 145 150

Arg Lys Thr Arg Arg Tyr Gly Val Leu Asp Thr Asn Ile Glu Asn
155 160 165

Met Glu Leu Thr Pro Leu Glu Gln Asp Asp Glu Asp Asp Asp Asn
170 175 180

Thr Leu Phe Asp Ala Asn His Pro Arg Arg Arg Glu Cys Ala Phe
185 190 195


(2) INFORMATION FOR SEQ ID NO: 21:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 161 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SPLNNOT04 

(B) CLONE: 1534876 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 : 



Met Trp Phe Leu Gly Cys Thr Gly Pro Gly Cys Gly Cys Ala Gly
5 10 15

Val Cys Lys Val Val Pro Cys Ile Ser Thr Gly Phe Glu Thr Ser
20 25 30

Gly Pro Cys Pro Ser Ser Arg Glu Gly Phe Leu Phe Phe Leu Thr
35 40 45

Gln Val Thr Phe Gln Pro Phe Gln Phe Pro Ser Phe Ser Ala Leu
50 55 60

Pro Ser Asn Ser Ala Asn Pro Gly Val Gly Ser Gln Gly Gly Arg
65 70 75

Glu Cys Pro Thr Thr Phe Ser Gly Gln Pro Leu Thr Pro Lys Pro
80 85 90

Leu Pro Pro Ser Ile Leu His Pro Leu Pro Ile Gln Pro Lys Cys
95 100 105

Pro Gln Leu Gly Leu Ser Cys Ile Pro Val Glu Gly Pro Leu Pro
110 115 120

Cys Leu Ser Glu Val Arg Leu Cys Cys Val Met Gly Arg Leu Cys
125 130 135

Pro Ser Pro Pro Leu Ala Arg Cys Thr Cys Phe Leu Val Cys Thr
140 145 150

Arg Cys Pro Gly Gly Pro Ser Leu Pro Cys Gln
155 160

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SPLNNOT04 

(B) CLONE: 1559131 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 : 





Met Asp Lys Leu Lys Lys Val Leu Ser Gly Gln Asp Thr Glu Asp
5 10 15

Arg Ser Gly Leu Ser Glu Val Val Glu Ala Ser Ser Leu Ser Trp
20 25 30

Ser Thr Arg Ile Lys Gly Phe Ile Ala Cys Phe Ala Ile Gly Ile
35 40 45

Leu Cys Ser Leu Leu Gly Thr Val Leu Leu Trp Val Pro Arg Lys
50 55 60

Gly Leu His Leu Phe Ala Val Phe Tyr Thr Phe Gly Asn Ile Ala
65 70 75

Ser Ile Gly Ser Thr Ile Phe Leu Met Gly Pro Val Lys Gln Leu
80 85 90

Lys Arg Met Phe Glu Pro Thr Arg Leu Ile Ala Thr Ile Met Val
95 100 105

Leu Leu Cys Phe Ala Leu Thr Leu Cys Ser Ala Phe Trp Trp His
110 115 120

Asn Lys Gly Leu Ala Leu Ile Phe Cys Ile Leu Gln Ser Leu Ala
125 130 135

Leu Thr Trp Tyr Ser Leu Ser Phe Ile Pro Phe Ala Arg Asp Ala
140 145 150

Val Lys Lys Cys Phe Ala Val Cys Leu Ala
155 160













(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BLADNOT03 

(B) CLONE: 1601473 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Met Gln Ala Lys Tyr Ser Ser Thr Arg Asp Met Leu Asp Asp Asp
5 10 15

Gly Asp Thr Thr Met Ser Leu His Ser Gln Ala Ser Ala Thr Thr
20 25 30

Arg His Pro Glu Pro Arg Arg Thr Glu His Arg Ala Pro Ser Ser
35 40 45

Thr Trp Arg Pro Val Ala Leu Thr Leu Leu Thr Leu Cys Leu Val
50 55 60

Leu Leu Ile Gly Leu Ala Ala Leu Gly Leu Leu Cys Lys Ser Ala
65 70 75

Leu

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 336 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAITUT12 

(B) CLONE: 1615809 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 : 



Met Ile Ser Tyr Ile Val Leu Leu Ser Ile Leu Leu Trp Pro Leu
5 10 15

Val Val Tyr His Glu Leu Ile Gln Arg Met Tyr Thr Arg Leu Glu
20 25 30

Pro Leu Leu Met Gln Leu Asp Tyr Ser Met Lys Ala Glu Ala Asn
35 40 45

Ala Leu His His Lys His Asp Lys Arg Lys Arg Gln Gly Lys Asn
50 55 60

Ala Pro Pro Gly Gly Asp Glu Pro Leu Ala Glu Thr Glu Ser Glu
65 70 75

Ser Glu Ala Glu Leu Ala Gly Phe Ser Pro Val Val Asp Val Lys
80 85 90

Lys Thr Ala Leu Ala Leu Ala Ile Thr Asp Ser Glu Leu Ser Asp
95 100 105

Glu Glu Ala Ser Ile Leu Glu Ser Gly Gly Phe Ser Val Ser Arg
110 115 120

Ala Thr Thr Pro Gln Leu Thr Asp Val Ser Glu Asp Leu Asp Gln
125 130 135

Gln Ser Leu Pro Ser Glu Pro Glu Glu Thr Leu Ser Arg Asp Leu
140 145 150

Gly Glu Gly Glu Glu Gly Glu Leu Ala Pro Pro Glu Asp Leu Leu
155 160 165

Gly Arg Pro Gln Ala Leu Ser Arg Gln Ala Leu Asp Ser Glu Glu
170 175 180

Glu Glu Glu Asp Val Ala Ala Lys Glu Thr Leu Leu Arg Leu Ser
185 190 195

Ser Pro Leu His Phe Val Asn Thr His Phe Asn Gly Ala Gly Ser
200 205 210

Pro Gln Asp Gly Val Lys Cys Ser Pro Gly Gly Pro Val Glu Thr
215 220 225

Leu Ser Pro Glu Thr Val Ser Gly Gly Leu Thr Ala Leu Pro Gly
230 235 240

Thr Leu Ser Pro Pro Leu Cys Leu Val Gly Ser Asp Pro Ala Pro
245 250 255

Ser Pro Ser Ile Leu Pro Pro Val Pro Gln Asp Ser Pro Gln Pro
260 265 270

Leu Pro Ala Pro Glu Glu Glu Glu Ala Leu Thr Thr Glu Asp Phe
275 280 285

Glu Leu Leu Asp Gln Gly Glu Leu Glu Gln Leu Asn Ala Glu Leu
290 295 300

Gly Leu Glu Pro Glu Thr Pro Pro Lys Pro Pro Asp Ala Pro Pro
305 310 315

Leu Gly Pro Asp Ile His Ser Leu Val Gln Ser Asp Gln Glu Ala
320 325 330

Gln Ala Val Ala Glu Pro
335



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: COLNNOT19 

(B) CLONE: 1634813 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Asn Leu Trp Leu Leu Ala Cys Leu Val Ala Gly Phe Leu Gly
5 10 15

Ala Trp Ala Pro Ala Val His Ala Gln Gly Val Phe Glu Asp Cys
20 25 30

Cys Leu Ala Tyr His Tyr Pro Ile Gly Trp Ala Val Leu Arg Arg
35 40 45

Ala Trp Thr Tyr Arg Ile Gln Glu Val Ser Gly Ser Cys Asn Leu
50 55 60

Pro Ala Ala Ile Phe Tyr Leu Pro Lys Arg His Arg Lys Val Cys
65 70 75

Gly Asn Pro Lys Ser Arg Glu Val Gln Arg Ala Met Lys Leu Leu
80 85 90

Asp Ala Arg Asn Lys Val Phe Ala Lys Leu Arg His Asn Thr Gln
95 100 105

Thr Phe Gln Ala Gly Pro His Ala Val Lys Lys Leu Ser Ser Gly
110 115 120

Asn Ser Lys Leu Ser Ser Ser Lys Phe Ser Asn Pro Ile Ser Ser
125 130 135

Ser Lys Arg Asn Val Ser Leu Leu Ile Ser Ala Asn Ser Gly Leu
140 145 150



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: UTRSNOT06 

(B) CLONE: 1638407 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26 : 

Met Ala Pro Pro Ala Leu Gln Arg Gly Gln Arg Val Ala Ala Val
5 10 15

Ala Val Gly Ser Gln Ala Val Leu Gln Ile Leu Ser Arg Val Ser
20 25 30

Gly Arg Gln Ala Pro Pro Gln Pro Ser Gly Ser Gly Gly Val Gly
35 40 45

Ala Gly Pro Val Val Val Pro Asp Gly Gly Gly Glu Gly Pro Gln
50 55 60

Pro His Pro Ser Ser Ser Gln Ser Pro Pro Asp Leu Pro Leu Lys
65 70 75

Ala Gly Asp Thr Val Met Gly Lys Gln Ala Gln Arg Asp Ile Arg
80 85 90

Leu Arg Val Arg Ala Glu Tyr Cys Glu His Gly Pro Ala Leu Glu
95 100 105

Gln Gly Val Ala Ser Arg Arg Pro Gln Ala Leu Ala Arg Gln Leu
110 115 120

Asp Val Phe Gly Gln Ala Thr Ala Val Leu Arg Ser Arg Asp Leu
125 130 135

Gly Ser Val Val Cys Asp Ile Lys Phe Ser Glu Leu Ser Tyr Leu
140 145 150

Asp Ala Phe Trp Gly Asp Tyr Leu Ser Gly Ala Leu Leu Gln Ala
155 160 165

Leu Arg Gly Val Phe Leu Thr Glu Ala Leu Arg Glu Ala Val Gly
170 175 180

Arg Glu Ala Val Arg Leu Leu Val Ser Val Asp Glu Ala Asp Tyr
185 190 195

Glu Ala Gly Arg Arg Arg Leu Leu Leu Met Ala Glu Glu Gly Gly
200 205 210

Arg Arg Pro Thr Glu Ala Ser
215






















(2) INFORMATION FOR SEQ ID NO: 27:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 504 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSTUT08 

(B) CLONE: 1653112 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 : 



Met Ser Gln Pro Arg Thr Pro Glu Gln Ala Leu Asp Thr Pro Gly
5 10 15

Asp Cys Pro Pro Gly Arg Arg Asp Glu Asp Ala Gly Glu Gly Ile
20 25 30

Gln Cys Ser Gln Arg Met Leu Ser Phe Ser Asp Ala Leu Leu Ser
35 40 45

Ile Ile Ala Thr Val Met Ile Leu Pro Val Thr His Thr Glu Ile
50 55 60

Ser Pro Glu Gln Gln Phe Asp Arg Ser Val Gln Arg Leu Leu Ala
65 70 75

Thr Arg Ile Ala Val Tyr Leu Met Thr Phe Leu Ile Val Thr Val
80 85 90

Ala Trp Ala Ala His Thr Arg Leu Phe Gln Val Val Gly Lys Thr
95 100 105

Asp Asp Thr Leu Ala Leu Leu Asn Leu Ala Cys Met Met Thr Ile
110 115 120

Thr Phe Leu Pro Tyr Thr Phe Ser Leu Met Val Thr Phe Pro Asp
125 130 135

Val Pro Leu Gly Ile Phe Leu Phe Cys Val Cys Val Ile Ala Ile
140 145 150

Gly Val Val Gln Ala Leu Ile Val Gly Tyr Ala Phe His Phe Pro
155 160 165

His Leu Leu Ser Pro Gln Ile Gln Arg Ser Ala His Arg Ala Leu
170 175 180

Tyr Arg Arg His Val Leu Gly Ile Val Leu Gln Gly Pro Ala Leu
185 190 195

Cys Phe Ala Ala Ala Ile Phe Ser Leu Phe Phe Val Pro Leu Ser
200 205 210

Tyr Leu Leu Met Val Thr Val Ile Leu Leu Pro Tyr Val Ser Lys
215 220 225

Val Thr Gly Trp Cys Arg Asp Arg Leu Leu Gly His Arg Glu Pro
230 235 240

Ser Ala His Pro Val Glu Val Phe Ser Phe Asp Leu His Glu Pro
245 250 255

Leu Ser Lys Glu Arg Val Glu Ala Phe Ser Asp Gly Val Tyr Ala
260 265 270

Ile Val Ala Thr Leu Leu Ile Leu Asp Ile Cys Glu Asp Asn Val
275 280 285

Pro Asp Pro Lys Asp Val Lys Glu Arg Phe Ser Gly Ser Leu Val
290 295 300

Ala Ala Leu Ser Ala Thr Gly Pro Arg Phe Leu Ala Tyr Phe Gly
305 310 315

Ser Phe Ala Thr Val Gly Leu Leu Trp Phe Ala His His Ser Leu
320 325 330

Phe Leu His Val Arg Lys Ala Thr Arg Ala Met Gly Leu Leu Asn
335 340 345

Thr Leu Ser Leu Ala Phe Val Gly Gly Leu Pro Leu Ala Tyr Gln
350 355 360

Gln Thr Ser Ala Phe Ala Arg Gln Pro Arg Asp Glu Leu Glu Arg
365 370 375

Val Arg Val Ser Cys Thr Ile Ile Phe Leu Ala Ser Ile Phe Gln
380 385 390

Leu Ala Met Trp Thr Thr Ala Leu Leu His Gln Ala Glu Thr Leu
395 400 405

Gln Pro Ser Val Trp Phe Gly Gly Arg Glu His Val Leu Met Phe
410 415 420

Ala Lys Leu Ala Leu Tyr Pro Cys Ala Ser Leu Leu Ala Phe Ala
425 430 435

Ser Thr Cys Leu Leu Ser Arg Phe Ser Val Gly Ile Phe His Leu
440 445 450

Met Gln Ile Ala Val Pro Cys Ala Phe Leu Leu Leu Arg Leu Leu
455 460 465

Val Gly Leu Ala Leu Ala Thr Leu Arg Val Leu Arg Gly Leu Ala
470 475 480

Arg Pro Glu His Pro Pro Pro Ala Pro Thr Gly Gln Asp Asp Pro
485 490 495

Gln Ser Gln Leu Leu Pro Ala Pro Cys
500



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 320 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTNOT09 

(B) CLONE: 1664634 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 : 



Met Ala Ala Arg Leu Asp Gly Gly Phe Ala Ala Val Ser Arg Ala
5 10 15

Phe His Glu Ile Arg Ala Arg Asn Pro Ala Phe Gln Pro Gln Thr
20 25 30

Leu Met Asp Phe Gly Ser Gly Thr Gly Ser Val Thr Trp Ala Ala
35 40 45

His Ser Ile Trp Gly Gln Ser Leu Arg Glu Tyr Met Cys Val Asp
50 55 60

Arg Ser Ala Ala Met Leu Val Leu Ala Glu Lys Leu Leu Thr Gly
65 70 75

Gly Ser Glu Ser Gly Glu Pro Tyr Ile Pro Gly Val Phe Phe Arg
80 85 90

Gln Phe Leu Pro Val Ser Pro Lys Val Gln Phe Asp Val Val Val
95 100 105

Ser Ala Phe Ser Leu Ser Asp Gln Leu Leu Thr Phe Ile Leu Ser
110 115 120

Cys Asn Ser Ser Leu Leu His Ile Phe Pro Phe Cys Glu Gln Val
125 130 135

Leu Val Glu Asn Gly Thr Lys Ala Gly His Ser Leu Leu Met Asp
140 145 150

Ala Arg Asp Leu Val Leu Lys Gly Lys Glu Lys Ser Pro Leu Asp
155 160 165

Pro Arg Pro Gly Phe Val Phe Ala Pro Cys Pro His Glu Leu Pro
170 175 180

Cys Pro Gln Leu Thr Asn Leu Ala Cys Ser Phe Ser Gln Ala Tyr
185 190 195

His Pro Ile Pro Phe Ser Trp Asn Lys Lys Pro Lys Glu Glu Lys
200 205 210

Phe Ser Met Val Ile Leu Ala Arg Gly Ser Pro Glu Glu Ala His
215 220 225

Arg Trp Pro Arg Ile Thr Gln Pro Val Leu Lys Arg Pro Arg His
230 235 240

Val His Cys His Leu Cys Cys Pro Asp Gly His Met Gln His Ala
245 250 255

Val Leu Thr Ala Arg Arg His Gly Arg Tyr Gly Gly Cys Asp Gln
260 265 270

Asn Gln Trp Asp Val Ala Gly Ser Cys Ser Pro Arg Gln His Leu
275 280 285

Phe Pro Gln Gly Phe Val Ser Leu Cys Pro Cys Gln Leu Leu Gly
290 295 300

Arg Ser Phe Thr Cys Ala Tyr Ser Val Cys Val Ser Ser Ile Tyr
305 310 315

Gly Ser Gly Ser Leu
320



(2) INFORMATION FOR SEQ ID NO: 29:



(i) SEQUENCE CHARACTERISTICS: 



RAW SEQUENCE LISTING DATE: 02/07/98 

PATENT APPLICATION US/09/002,485 TIME: 14:21:12 

INPUT SET: S23256.raw 



This Raw Listing contains the General 
Information Section and up to the first 5 pages. 




SEQUENCE LISTING 

General Information: 

(i) APPLICANT: Lal, Preeti

Hillman, Jennifer L. 
Corley, Neil C. 
Guegler, Karl J. 
Baugh, Mariah 
Sather, Susan 
Shah, Purvi 

(ii) TITLE OF INVENTION: HUMAN SIGNAL PEPTIDE-CONTAINING PROTEINS



(iii) NUMBER OF SEQUENCES: 154

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 

(B) STREET: 3174 PORTER DRIVE 

(C) CITY: PALO ALTO 

(D) STATE: CALIFORNIA 

(E) COUNTRY: USA 

(F) ZIP: 94304 



(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: TO BE ASSIGNED 

(B) FILING DATE: HEREWITH 

(C) CLASSIFICATION: 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: BILLINGS, LUCY J.

(B) REGISTRATION NUMBER: 36,749 

(C) REFERENCE/DOCKET NUMBER: PF-0459 US




(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (650) 855-0555 

(B) TELEFAX: (650) 845-4166 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: HEARNOT01

(B) CLONE: 305841 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1 : 



Met 


Ala 


nlu 


1 11 J. 


5 


kj j- y 






v3 j_ _y 


CJ o y" 

10 


T 1 yo 
1 ip 


Gin 


Gin 


iL P 


Arg 
15 


Arg 


Cys 


Leu 


Ser 


Ala 
20 


Arg 


Asp 


Gly 


Ser 


A yet 

25 


Met 


Leu 


Leu 


Leu 


Leu 
30 


Leu 


Leu 


Leu 


Gly 


Ser 
35 


Gly 


Gin 


Gly 


Pro 


Gin 
40 


Gin 


Val 


Gly 


Ala 


Gly 
45 


Gin 


Thr 


Phe 


Glu 


Tyr 
50 


Leu 


Lys 


Arg 


Glu 


His 
55 


Ser 


Leu 


Ser 


Lys 


Pro 
60 


Tyr 


Gin 


Gly 


Val 


Gly 
65 


Thr 


Gly 


Ser 


Ser 


Ser 
70 


Leu 


Trp 


Asn 


Leu 


Met 
75 


Gly Asn 


Ala 


Met 


Val 


Met 


Thr 


Gin 


Tyr 


He 


Arg 


Leu 


Thr 


Pro 


Asp 










80 










85 










90 


Met 


Gin 


Ser 


Lys 


Gin 
95 


Gly 


Ala 


Leu 


Trp 


Asn 
100 


Arg 


Val 


Pro 


Cys 


Phe 
105 


Leu 


Arg 


Asp 


Trp 


Glu 
110 


Leu 


Gin 


Val 


His 


Phe 
115 


Lys 


He 


His 


Gly 


Gin 
120 


Gly 


Lys 


Lys 


Asn 


Leu 
125 


His 


Gly 


Asp 


Gly 


Leu 
130 


Ala 


He 


Trp 


Tyr 


Thr 
135 


Lys 


Asp 


Arg 


Met 


Gin 
140 


Pro 


Gly 


Pro 


Val 


Phe 
145 


Gly 


Asn 


Met 


Asp 


Lys 
150 


Phe 


Val 


Gly 


Leu 


Gly 
155 


Val 


Phe 


Val 


Asp 


Thr 
160 


Tyr 


Pro 


Asn 


Glu 


Glu 
165 


Lys 


Gin 


Gin 


Glu 


Arg 
170 


Val 


Phe 


Pro 


Tyr 


He 
175 


Ser 


Ala 


Met 


Val 


Asn 
180 


Asn 


Gly 


Ser 


Leu 


Ser 


Tyr 


Asp 


His 


Glu 


Arg 


Asp 


Gly Arg 


Pro 


Thr 










185 










190 










195 


Glu 


Leu 


Gly 


Gly 


Cys 
200 


Thr 


Ala 


He 


Val 


Arg 
205 


Asn 


Leu 


His 


Tyr 


Asp 
210 


Thr 


Phe 


Leu 


Val 


He 
215 


Arg 


Tyr 


Val 


Lys 


Arg 
220 


His 


Leu 


Thr 


He 


Met 
225 


Met 


Asp 


He 


Asp 


Gly 
230 


Lys 


His 


Glu 


Trp 


Arg 
235 


Asp 


Cys 


He 


Glu 


Val 
240 


Pro 


Gly 


Val 


Arg 


Leu 


Pro 


Arg 


Gly 


Tyr 


Tyr 


Phe 


Gly Thr 


Ser 


Ser 



245 250 255 



lie Thr Gly Asp 


Leu 


Ser 


Asp 


Asn 


His 


Asp 


Val 


He 


Ser 


Leu 


Lys 










260 










265 










270 


Leu 


Phe 


Glu 


Leu 


Thr 


Val 


Glu 


Arg 


Thr 


Pro 


Glu 


Glu 


Glu 


Lys 


Leu 










275 










280 










285 


His 


Arg 


Asp 


Val 


Phe 


Leu 


Pro 


Ser 


Val 


Asp 


Asn 


Met 


Lys 


Leu 


Pro 










290 










295 










300 


Glu 


Met 


Thr 


Ala 


Pro 


Leu 


Pro 


Pro 


Leu 


Ser 


Gly 


Leu 


Ala 


Leu 


Phe 










305 










310 










315 


Leu 


lie 


Val 


Phe 


Phe 


Ser 


Leu 


Val 


Phe 


Ser 


Val 


Phe 


Ala 


He 


Val 










320 










325 










330 


lie 


Gly 


He 


He 


Leu 


Tyr 


Asn 


Lys 


Trp 


Gin 


Glu 


Gin 


Ser 


Arg 


Lys 



335 340 345 

Arg Phe Tyr 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 194 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: EOSIHET02 

(B) CLONE: 322866 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2 : 



Met 


Gly 


Met 


Ser 


Ser 


Leu 


Lys 


Leu 


Leu 


Lys 


Tyr 


Val Leu 


Phe Phe 










5 










10 






15 


Phe 


Asn 


Leu 


Leu 


Phe 


Trp 


He 


Cys 


Gly 


Cys 


Cys 


He Leu 


Gly Phe 










20 










25 






30 


Gly He 


Tyr 


Leu 


Leu 


He 


His 


Asn 


Asn 


Phe 


Gly 


Val Leu 


Phe His 










35 










40 






45 


Asn 


Leu 


Pro 


Ser 


Leu 


Thr 


Leu 


Gly 


Asn 


Val 


Phe 


Val He 


Val Gly 










50 










55 






60 


Ser 


He 


He 


Met 


Val 


Val 


Ala 


Phe 


Leu 


Gly 


Cys 


Met Gly 


Ser He 










65 










70 






75 


Lys 


Glu 


Asn 


Lys 


Cys 


Leu 


Leu 


Met 


Ser 


Phe 


Phe 


He Leu 


Leu Leu 










80 










85 






90 


He 


He 


Leu 


Leu 


Ala 


GlU 


Val 


Thr 


Leu 


Ala 


He 


Leu Leu 


Phe Val 










95 










100 






105 


Tyr 


Glu 


Gin 


Lys 


Leu 


Asn 


Glu 


Tyr 


Val 


Ala 


Lys 


Gly Leu 


Thr Asp 










110 










115 






120 


Ser 


He 


His 


Arg 


Tyr 


His 


Ser 


Asp 


Asn 


Ser 


Thr 


Lys Ala 


Ala Trp 










125 










130 






135 


Asp 


Ser 


He 


Gin 


Ser 


Phe 


Leu 


Gin 


Cys 


Cys 


Gly 


He Asn 


Gly Thr 










140 










145 






150 


Ser 


Asp 


Leu 


Asp 


Ser 


Gly 


Ser 


Pro 


Ala 


Ser 


Cys 


Pro Ser 


Asp Arg 



PAGE: 4 RAW SEQUENCE LISTING DATE: 02/07/98 

PATENT APPLICATION US/09/002,485 TIME: 14:21:23 

INPUT SET: S23256.raw

153 155 160 165 

154 Lys Val Glu Gly Cys Tyr Ala Lys Glu Asp Phe Gly Phe lie Gin 

155 170 175 180 

156 Phe Pro Val Tyr Arg Asn His His His Leu Cys Met Cys Asp 

157 185 190 
158 

159 
160 
161 

162 (2) INFORMATION FOR SEQ ID NO: 3: 
163 

164 (i) SEQUENCE CHARACTERISTICS: 

165 (A) LENGTH: 342 amino acids 

166 (B) TYPE: amino acid 

167 (C) STRANDEDNESS: single 

168 (D) TOPOLOGY: linear 
169 

170 (vii) IMMEDIATE SOURCE:

171 (A) LIBRARY: BEPINOT01

172 (B) CLONE: 546656
173

174 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3 :

175





176 


Met 


Ser 


Leu 


His 


Gly 


Lys 


Arg 


Lys 


Glu 


He 


Tyr 


Lys 


Tyr 


Glu 


Ala 




177 










5 










10 










15 




178 


Pro 


Trp 


Thr 


Val 


Tyr 


Ala 


Met 


Asn 


Trp 


Ser 


Val 


Arg 


Pro 


Asp 


Lys 




179 










20 










25 










30 




180 


Arg 


Phe 


Arg 


Leu 


Ala 


Leu 


Gly 


Ser 


Phe 


Val 


Glu 


Glu 


Tyr 


Asn 


Asn 




181 










35 










40 










45 




182 


Lys 


Val 


Gin 


Leu 


Val 


Gly 


Leu 


Asp 


Glu 


Glu 


Ser 


Ser 


Glu 


Phe 


He 




183 










50 










55 










60 




184 


Cys 


Arg 


Asn 


Thr 


Phe 


Asp 


His 


Pro 


Tyr 


Pro 


Thr 


Thr 


Lys 


Leu 


Met 




185 










65 










70 










75 




186 


Trp 


He 


Pro 


Asp 


Thr 


Lys 


Gly 


Val 


Tyr 


Pro 


Asp 


Leu 


Leu 


Ala 


Thr 




187 










80 










85 










90 




188 


Ser 


Gly 


Asp 


Tyr 


Leu 


Arg 


Val 


Trp 


Arg 


Val 


Gly 


Glu 


Thr 


Glu 


Thr 




189 










95 










100 










105 




190 


Arg 


Leu 


Glu 


Cys 


Leu 


Leu 


Asn 


Asn 


Asn 


Lys 


Asn 


Ser 


Asp 


Phe 


Cys 




191 










110 










115 










120 




192 


Ala 


Pro 


Leu 


Thr 


Ser 


Phe 


Asp 


Trp 


Asn 


Glu 


Val 


Asp 


Pro 


Tyr 


Leu 




193 










125 










130 










135 




194 


Leu Gly 


Thr 


Ser 


Ser 


He 


Asp 


Thr 


Thr 


Cys 


Thr 


He 


Trp 


Gly 


Leu 




195 










140 










145 










150 




196 


Glu 


Thr 


Gly 


Gin 


Val 


Leu 


Gly 


Arg 


Val 


Asn 


Leu 


Val 


Ser 


Gly 


His 




197 










155 










160 










165 




198 


Val 


Lys 


Thr 


Gin 


Leu 


He 


Ala 


His 


Asp 


Lys 


Glu 


Val 


Tyr 


Asp 


He 




199 










170 










175 










180 




200 


Ala 


Phe 


Ser 


Arg 


Ala 


Gly 


Gly 


Gly 


Arg 


Asp 


Met 


Phe 


Ala 


Ser 


Val 




201 










185 










190 










195 




202 


Gly Ala 


Asp 


Gly 


Ser 


Val 


Arg 


Met 


Phe 


Asp 


Leu 


Arg 


His 


Leu 


Glu 




203 










200 










205 










210 




204 


His 


Ser 


Thr 


He 


He 


Tyr 


Glu 


Asp 


Pro 


Gin 


His 


His 


Pro 


Leu 


Leu 




205 










215 










220 










225 



PAGE: 5 RAW SEQUENCE LISTING DATE: 02/07/98 

PATENT APPLICATION US/09/002,485 TIME: 14:21:27 

INPUT SET: S23256.raw



206 


Arg 


Leu 


Cys Trp 


Asn 


Lys 


Gin 


Asp 


Pro 


Asn 


Tyr 


Leu 


Ala 


Thr 


Met 


207 








230 










235 










240 


208 


Ala 


Met 


Asp Gly 


Met 


Glu 


Val 


Val 


He 


Leu 


Asp 


Val 


Arg 


Val 


Pro 


209 








245 










250 










255 


210 


Cys 


Thr 


Pro Val 


Ala 


Arg 


Leu 


Asn 


Asn 


His 


Arg 


Ala 


Cys 


Val 


Asn 


211 






260 










265 










270 


212 


Gly lie 


Ala Trp 


Ala 


Pro 


His 


Ser 


Ser 


Cys 


His 


He 


Cys 


Thr 


Ala 


213 








275 










280 










285 


214 


Ala 


Asp 


Asp His 


Gin 


Ala 


Leu 


He 


Trp 


Asp 


He 


Gin 


Gin 


Met 


Pro 


215 








290 










295 










300 


216 


Arg 


Ala 


He Glu 


Asp 


Pro 


He 


Leu 


Ala 


Tyr 


Thr 


Ala 


Glu 


Gly Glu 


217 






305 










310 










315 


218 


lie 


Asn 


Asn Val 


Gin 


Trp 


Ala 


Ser 


Thr 


Gin 


Pro 


Asp 


Trp 


He 


Ala 


219 








320 










325 










330 


220 


lie 


Cys 


Tyr Asn 


Asn 


Cys 


Leu 


Glu 


He 


Leu 


Arg 


Val 








221 








335 










340 












222 






























223 






























224 






























225 






























226 (2) INFORMATION FOR SEQ ID NO: 4:
227

228 (i) SEQUENCE CHARACTERISTICS:

229 (A) LENGTH: 656 amino acids

230 (B) TYPE: amino acid

231 (C) STRANDEDNESS: single

232 (D) TOPOLOGY: linear
233

234 (vii) IMMEDIATE SOURCE:

235 (A) LIBRARY: SYNORAT03

236 (B) CLONE: 693453
237

238 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 :








239 






























240 


Met 


Glu 


Glu Leu 


Asp 


Gly 


Glu 


Pro 


Thr 


Val 


Thr 


Leu 


He 


Pro 


Gly 


241 








5 










10 










15 


242 


Val 


Asn 


Ser Lys 


Lys 


Asn 


Gin 


Met 


Tyr 


Phe 


Asp 


Trp Gly Pro 


Gly 


243 








20 










25 










30 


244 


Glu 


Met 


Leu Val 


Cys 


Glu 


Thr 


Ser 


Phe 


Asn 


Lys 


Lys 


Glu 


Lys 


Ser 


245 








35 










40 










45 


246 


Glu 


Met 


Val Pro 


Ser 


Cys 


Pro 


Phe 


He 


Tyr 


He 


He 


Arg 


Lys 


Asp 


247 








50 










55 










60 


248 


Val 


Asp 


Val Tyr 


Ser 


Gin 


He 


Leu 


Arg 


Lys 


Leu 


Phe 


Asn 


Glu 


Ser 


249 








65 










70 










75 


250 


His 


Gly 


He Phe 


Leu 


Gly 


Leu 


Gin 


Arg 


He 


Asp 


Glu 


Glu 


Leu 


Thr 


251 








80 










85 










90 


252 


Gly 


Lys 


Ser Arg 


Lys 


Ser 


Gin 


Leu 


Val 


Arg 


Val 


Ser 


Lys 


Asn 


Tyr 


253 








95 










100 










105 


254 


Arg 


Ser 


Val He 


Arg 


Ala 


Cys 


Met 


Glu 


Glu 


Met 


His 


Gin 


Val 


Ala 


255 








110 










115 










120 


256 


He 


Ala 


Ala Lys 


Asp 


Pro 


Ala 


Asn 


Gly 


Arg 


Gin 


Phe 


Ser 


Ser 


Gin 


257 








125 










130 










135 


258 


Val 


Ser 


He Leu 


Ser 


Ala 


Met 


Glu 


Leu 


He 


Trp 


Asn 


Leu 


Cys 


Glu 





PAGE: 1 SEQUENCE VERIFICATION REPORT DATE: 02/07/98

PATENT APPLICATION US/09/002,485 TIME: 14:21:31 

INPUT SET: S23256.raw



Line Error Original Text 

36 Wrong application Serial Number (A) APPLICATION NUMBER: TO BE ASSIGNED 




(A) LENGTH: 117 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSTUT10 

(B) CLONE: 1690990 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29 : 



Met Asp Asn Lys Gly Ile Tyr Pro Gly Ala Val Phe Tyr His Asp
5 10 15

Ser Phe Thr Glu Ser Arg Val Val Leu Leu Arg Ile Arg Thr Leu
20 25 30

Val Pro Tyr Ser Pro Pro Asp Cys Pro Thr Thr Thr Thr Ala Tyr
35 40 45

Ser Pro Phe Pro Asn His Gly Gln Gln Ile Glu Leu Leu Thr Glu
50 55 60

Val Ser Phe Arg Trp Ile Ser Gln Pro Phe Pro His Arg Pro His
65 70 75

Arg Glu Thr Val Thr Asp Cys Tyr Ser Pro Asn Thr Gln Val Lys
80 85 90

Ser Asn Ala Gly Arg Asn Asn Ser Lys Ser Phe Asn Phe Leu Ile
95 100 105

Leu Leu Leu Lys Ile Leu Thr Glu Ala Ser Arg Phe
110 115












(2) INFORMATION FOR SEQ ID NO: 30:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: DUODNOT02 

(B) CLONE: 1704050 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30 :

Met Ala Arg Arg Ser Arg His Arg Leu Leu Leu Leu Leu Leu Arg
5 10 15

Tyr Leu Val Val Ala Leu Gly Tyr His Lys Ala Tyr Gly Phe Ser
20 25 30

Ala Pro Lys Asp Gln Gln Val Val Thr Ala Val Glu Tyr Gln Glu
35 40 45

Ala Ile Leu Ala Cys Lys Thr Pro Lys Lys Thr Val Ser Ser Arg
50 55 60

Leu Glu Trp Lys Lys Leu Gly Arg Ser Val Ser Phe Val Tyr Tyr
65 70 75

Gln Gln Thr Leu Gln Gly Asp Phe Lys Asn Arg Ala Glu Met Ile
80 85 90

Asp Phe Asn Ile Arg Ile Lys Asn Val Thr Arg Ser Asp Ala Gly
95 100 105




Tyr 


Arg 


Cys 


Glu 


Val 


Ser 


Ala 


Pro 


Ser 


Glu 


Gin 


Gly 


Gin 


Asn 










110 










115 










120 



Leu Glu Glu Asp Thr Val Thr Leu Glu Val Leu Val Ala Pro Ala
125 130 135

Val Pro Ser Cys Glu Val Pro Ser Ser Ala Leu Ser Gly Thr Val
140 145 150

Val Glu Leu Arg Cys Gln Asp Lys Glu Gly Asn Pro Ala Pro Glu
155 160 165

Tyr Thr Trp Phe Lys Asp Gly Ile Arg Leu Leu Glu Asn Pro Arg
170 175 180

Leu Gly Ser Gln Ser Thr Asn Ser Ser Tyr Thr Met Asn Thr Lys
185 190 195

Thr Gly Thr Leu Gln Phe Asn Thr Val Ser Lys Leu Asp Thr Gly
200 205 210

Glu Tyr Ser Cys Glu Ala Arg Asn Ser Val Gly Tyr Arg Arg Cys
215 220 225

Pro Gly Lys Arg Met Gln Val Asp Asp Leu Asn Ile Ser Gly Ile
230 235 240

Ile Ala Ala Val Val Val Val Ala Leu Val Ile Ser Val Cys Gly
245 250 255

Leu Gly Val Cys Tyr Ala Gln Arg Lys Gly Tyr Phe Ser Lys Glu
260 265 270

Thr Ser Phe Gln Lys Ser Asn Ser Ser Ser Lys Ala Thr Thr Met
275 280 285

Ser Glu Asn Asp Phe Lys His Thr Lys Ser Phe Ile Ile
290 295












(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: PROSNOT16

(B) CLONE: 1711840

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31 :








Met Gln His Arg Gly Phe Leu Leu Leu Thr Leu Leu Ala Leu Leu
5 10 15

Ala Leu Thr Ser Ala Val Ala Lys Lys Gln Asp Lys Val Lys Lys
20 25 30

Gly Gly Pro Gly Ser Glu Cys Ala Glu Trp Ala Trp Gly Pro Cys
35 40 45

Thr Pro Ser Ser Lys Gly Phe Ala Ala Val Gly Phe Pro Arg Gly
50 55 60

Pro Pro Trp Gly Gly Pro Arg Thr Gln Pro Ala Val Leu Val Glu
65 70 75

Arg Val Ala Pro Gly Lys Leu Glu Arg Lys Glu Phe Trp Ala Pro
80 85 90

Gly Leu Trp Lys Val Gly Gln Ile Phe Trp Lys Lys Thr Trp Arg
95 100 105

Val Cys Arg Ser Val Lys Trp Gly Arg Gly Gln Lys Asn
110 115



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 248 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 : 



Met Gln Thr Cys Pro Leu Ala Phe Pro Gly His Val Ser Gln Ala
5 10 15

Leu Gly Thr Leu Leu Phe Leu Ala Ala Ser Leu Ser Ala Gln Asn
20 25 30

Glu Gly Trp Asp Ser Pro Ile Cys Thr Glu Gly Val Val Ser Val
35 40 45

Ser Trp Gly Glu Asn Thr Val Met Ser Cys Asn Ile Ser Asn Ala
50 55 60

Phe Ser His Val Asn Ile Lys Leu Arg Ala His Gly Gln Glu Ser
65 70 75


Ala Ile Phe Asn Glu Val Ala Pro Gly Tyr Phe Ser Arg Asp Gly
80 85 90


Trp Gln Leu Gln Val Gln Gly Gly Val Ala Gln Leu Val Ile Lys
95 100 105

Gly Ala Arg Asp Ser His Ala Gly Leu Tyr Met Trp His Leu Val
110 115 120

Gly His Gln Arg Asn Asn Arg Gln Val Thr Leu Glu Val Ser Gly
125 130 135

Ala Glu Pro Gln Ser Ala Pro Asp Thr Gly Phe Trp Pro Val Pro
140 145 150

Ala Val Val Thr Ala Val Phe Ile Leu Leu Val Ala Leu Val Met
155 160 165

Phe Ala Trp Tyr Arg Cys Arg Cys Ser Gln Gln Arg Arg Glu Lys
170 175 180

Lys Phe Phe Leu Leu Glu Pro Gln Met Lys Val Ala Ala Leu Arg
185 190 195

Ala Gly Ala Gln Gln Gly Leu Ser Arg Ala Ser Ala Glu Leu Trp
200 205 210

Thr Pro Asp Ser Glu Pro Thr Pro Arg Pro Leu Ala Leu Val Phe
215 220 225

Lys Pro Ser Pro Leu Gly Ala Leu Glu Leu Leu Ser Pro Gln Pro
230 235 240

Leu Phe Pro Tyr Ala Ala Asp Pro
245



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: STOMTUT02 

(B) CLONE: 1750632 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Met Leu Glu Glu Gly Ser Phe Arg Gly Arg Thr Ala Asp Phe Val
5 10 15

Phe Met Phe Leu Phe Gly Gly Val Leu Met Thr Val Ser Phe Pro
20 25 30

Gln Ala Leu Glu Pro Arg Ala Arg Ala Pro Arg Arg Pro Ala Cys
35 40 45


Val


Gly


ir J. u 




Ala 


Asn 


Thr 


Ala 


Met 


Pro 


Glu 


Arg 


Asp 


Thr 


Val 










50 










55 










60 


Ala Val Ser Ser Leu Ala Pro Phe Leu Pro Trp Ala Leu Met Gly
65 70 75

Phe Ser Leu Leu Leu Gly Asn Ser Ile Leu Val Asp Leu Leu Gly
80 85 90

Ile Ala Val Gly His Ile Tyr Tyr Phe Leu Glu Asp Val Phe Pro
95 100 105

Asn Gln Pro Gly Gly Lys Arg Leu Leu Gln Thr Pro Gly Phe Leu
110 115 120

Lys Leu Leu Leu Asp Ala Pro Ala Glu Asp Pro Asn Tyr Leu Pro
125 130 135

Leu Pro Glu Glu Gln Pro Gly Pro His Leu Pro Pro Pro Gln Gln
140 145 150


(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 431 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:








Met Trp Ala Leu Gly Gln Ala Gly Phe Ala Asn Leu Thr Glu Gly
5 10 15

Leu Lys Val Trp Leu Gly Ile Met Leu Pro Val Leu Gly Ile Lys
20 25 30

Ser Leu Ser Pro Phe Ala Ile Thr Tyr Leu Asp Arg Leu Leu Leu
35 40 45

Met His Pro Asn Leu Thr Lys Gly Phe Gly Met Ile Gly Pro Lys
50 55 60

Asp Phe Phe Pro Leu Leu Asp Phe Ala Tyr Met Pro Asn Asn Ser
65 70 75

Leu Thr Pro Ser Leu Gln Glu Gln Leu Cys Gln Leu Tyr Pro Arg
80 85 90

Leu Lys Met Leu Ala Phe Gly Ala Lys Pro Asp Ser Thr Leu His
95 100 105

Thr Tyr Phe Pro Ser Phe Leu Ser Arg Ala Thr Pro Ser Cys Pro
110 115 120

Pro Glu Met Lys Lys Glu Leu Leu Ser Ser Leu Thr Glu Cys Leu
125 130 135

Thr Val Asp Pro Leu Ser Ala Ser Val Trp Arg Gln Leu Tyr Pro
140 145 150

Lys His Leu Ser Gln Ser Ser Leu Leu Leu Glu His Leu Leu Ser
155 160 165

Ser Trp Glu Gln Ile Pro Lys Lys Val Gln Lys Ser Leu Gln Glu
170 175 180

Thr Ile Gln Ser Leu Lys Leu Thr Asn Gln Glu Leu Leu Arg Lys
185 190 195

Gly Ser Ser Asn Asn Gln Asp Val Val Thr Cys Asp Met Ala Cys
200 205 210

Lys Gly Leu Leu Gln Gln Val Gln Gly Pro Arg Leu Pro Trp Thr
215 220 225

Arg Leu Leu Leu Leu Leu Leu Val Phe Ala Val Gly Phe Leu Cys
230 235 240

His Asp Leu Arg Ser His Ser Ser Phe Gln Ala Ser Leu Thr Gly
245 250 255

Arg Leu Leu Arg Ser Ser Gly Phe Leu Pro Ala Ser Gln Gln Ala
260 265 270

Cys Ala Lys Leu Tyr Ser Tyr Ser Leu Gln Gly Tyr Ser Trp Leu
275 280 285

Gly Glu Thr Leu Pro Leu Trp Gly Ser His Leu Leu Thr Val Val
290 295 300

Arg Pro Ser Leu Gln Leu Ala Trp Ala His Thr Asn Ala Thr Val
305 310 315

Ser Phe Leu Ser Ala His Cys Ala Ser His Leu Ala Trp Phe Gly
320 325 330

Asp Ser Leu Thr Ser Leu Ser Gln Arg Leu Gln Ile Gln Leu Pro
335 340 345

Asp Ser Val Asn Gln Leu Leu Arg Tyr Leu Arg Glu Leu Pro Leu
350 355 360

Leu Phe His Gln Asn Val Leu Leu Pro Leu Trp His Leu Leu Leu
365 370 375

Glu Ala Leu Ala Trp Ala Gln Glu His Cys His Glu Ala Cys Arg
380 385 390

Gly Glu Val Thr Trp Asp Cys Met Lys Thr Gln Leu Ser Glu Ala
395 400 405

Val His Trp Thr Trp Leu Cys Leu Gln Asp Ile Thr Val Ala Phe
410 415 420

Leu Asp Trp Ala Leu Ala Leu Ile Ser Gln Gln
425 430



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 278 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT20 

(B) CLONE: 1818761 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Met Gln Trp Leu Arg Val Arg Glu Ser Pro Gly Glu Ala Thr Gly
5 10 15

His Arg Val Thr Met Gly Thr Ala Ala Leu Gly Pro Val Trp Ala
20 25 30

Ala Leu Leu Leu Phe Leu Leu Met Cys Glu Ile Pro Met Val Glu
35 40 45

Leu Thr Phe Asp Arg Ala Val Ala Ser Gly Cys Gln Arg Cys Cys
50 55 60

Asp Ser Glu Asp Pro Leu Asp Pro Ala His Val Ser Ser Ala Ser
65 70 75

Ser Ser Gly Arg Pro His Ala Leu Pro Glu Ile Arg Pro Tyr Ile
80 85 90

Asn Ile Thr Ile Leu Lys Gly Asp Lys Gly Asp Pro Gly Pro Met
95 100 105

Gly Leu Pro Gly Tyr Met Gly Arg Glu Gly Pro Gln Gly Glu Pro
110 115 120

Gly Pro Gln Gly Ser Lys Gly Asp Lys Gly Glu Met Gly Ser Pro
125 130 135

Gly Ala Pro Cys Gln Lys Arg Phe Phe Ala Phe Ser Val Gly Arg
140 145 150

Lys Thr Ala Leu His Ser Gly Glu Asp Phe Gln Thr Leu Leu Phe
155 160 165

Glu Arg Val Phe Val Asn Leu Asp Gly Cys Phe Asp Met Ala Thr
170 175 180

Gly Gln Phe Ala Ala Pro Leu Arg Gly Ile Tyr Phe Phe Ser Leu
185 190 195

Asn Val His Ser Trp Asn Tyr Lys Glu Thr Tyr Val His Ile Met
200 205 210

His Asn Gln Lys Glu Ala Val Ile Leu Tyr Ala Gln Pro Ser Glu
215 220 225

Arg Ser Ile Met Gln Ser Gln Ser Val Met Leu Asp Leu Ala Tyr
230 235 240

Gly Asp Arg Val Trp Val Arg Leu Phe Lys Arg Gln Arg Glu Asn
245 250 255

Ala Ile Tyr Ser Asn Asp Phe Asp Thr Tyr Ile Thr Phe Ser Gly
260 265 270

His Leu Ile Lys Ala Glu Asp Asp
275



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GBLATUT01 

(B) CLONE: 1824469 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Met Glu Glu Lys Arg Arg Arg Ala Arg Val Gln Gly Ala Trp Ala
5 10 15

Ala Pro Val Lys Ser Gln Ala Ile Ala Gln Pro Ala Thr Thr Ala
20 25 30

Lys Ser His Leu His Gln Lys Pro Gly Gln Thr Trp Lys Asn Lys
35 40 45

Glu His His Leu Ser Asp Arg Glu Phe Val Phe Lys Glu Pro Gln
50 55 60

Gln Val Val Arg Arg Ala Pro Glu Pro Arg Val Ile Asp Arg Glu
65 70 75

Gly Val Tyr Glu Ile Ser Leu Ser Pro Thr Gly Val Ser Arg Val
80 85 90

Cys Leu Tyr Pro Gly Phe Val Asp Val Lys Glu Ala Asp Trp Ile
95 100 105

Leu Glu Gln Leu Cys Gln Asp Val Pro Trp Lys Gln Arg Thr Gly
110 115 120

Ile Arg Glu Asp Ile Thr Tyr Gln Gln Pro Arg Leu Thr Ala Trp
125 130 135

Tyr Gly Glu Leu Pro Tyr Thr Tyr Ser Arg Ile Thr Met Glu Pro
140 145 150

Asn Pro His Trp His Pro Val Leu Arg Thr Leu Lys Asn Arg Ile
155 160 165

Glu Glu Asn Thr Gly His Thr Phe Asn Ser Leu Leu Cys Asn Leu
170 175 180

Tyr Arg Asn Glu Lys Asp Ser Val Asp Trp His Ser Asp Asp Glu
185 190 195




Ser 


Leu 


Gly 


Arg 


Cys 


Pro 


Ile


Ile


Ala 


Ser 


Leu 


Ser 


Phe 


Gly 










200 










205 










210 


Ala Thr Arg Thr Phe Glu Met Arg Lys Lys Pro Pro Pro Glu Glu
215 220 225

Asn Gly Asp Tyr Thr Tyr Val Glu Arg Val Lys Ile Pro Leu Asp
230 235 240

His Gly Thr Leu Leu Ile Met Glu Gly Ala Thr Gln Ala Asp Trp
245 250 255

Gln His Arg Val Pro Lys Glu Tyr His Ser Arg Glu Pro Arg Val
260 265 270

Asn Leu Thr Phe Arg Thr Val Tyr Pro Asp Pro Arg Gly Ala Pro
275 280 285

Trp



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 404 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT19 

(B) CLONE: 1864292 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Met Lys Met Glu Glu Val Glu Glu Leu Ile Ala Val Gly Lys Glu
5 10 15

Ser Glu Ala Pro Pro Gln Glu Thr Ala Lys Lys Ala Ser Glu Glu
20 25 30

Glu Asp Gly Ser Val Gln Val Gln Lys Asp Glu Leu Glu Ser Gly
35 40 45

Val Ala Asp Ser Thr Val Ile Ser Ser Met Pro Cys Leu Leu Met
50 55 60

Glu Leu Arg Arg Asp Ser Ser Glu Ser Gln Leu Ala Ser Thr Glu
65 70 75

Ser Asp Lys Pro Thr Thr Gly Arg Val Tyr Glu Ser Asp Pro Ser
80 85 90

Asn His Cys Met Leu Ser Pro Ser Ser Ser Gly His Leu Ala Asp
95 100 105

Ser Asp Thr Leu Ser Ser Ala Glu Glu Asn Glu Pro Ser Gln Ala
110 115 120

Glu Thr Ala Val Glu Gly Asp Pro Ser Gly Val Ser Gly Ala Thr
125 130 135

Val Gly Arg Lys Ser Arg Arg Ser Arg Ser Glu Ser Glu Thr Ser
140 145 150

Thr Met Ala Ala Lys Lys Asn Arg Gln Ser Ser Asp Lys Gln Asn
155 160 165

Gly Arg Val Ala Lys Val Lys Gly His Arg Ser Gln Lys His Lys
170 175 180

Glu Arg Ile Arg Leu Leu Arg Gln Lys Arg Glu Ala Ala Ala Arg
185 190 195

Lys Lys Tyr Asn Leu Leu Gln Asp Ser Ser Thr Ser Asp Ser Asp
200 205 210

Leu Thr Cys Asp Ser Ser Thr Ser Ser Ser Asp Asp Asp Glu Glu
215 220 225

Val Ser Gly Ser Ser Lys Thr Ile Thr Ala Glu Ile Pro Asp Gly
230 235 240

Pro Pro Val Val Ala His Tyr Asp Met Ser Asp Thr Asn Ser Asp
245 250 255

Pro Glu Val Val Asn Val Asp Asn Leu Leu Ala Ala Ala Val Val
260 265 270

Gln Glu His Ser Asn Ser Val Gly Gly Gln Asp Thr Gly Ala Thr
275 280 285

Trp Arg Thr Ser Gly Leu Leu Glu Glu Leu Asn Ala Glu Ala Gly
290 295 300

His Leu Asp Pro Gly Phe Leu Ala Ser Asp Lys Thr Ser Ala Gly
305 310 315

Asn Ala Pro Leu Asn Glu Glu Ile Asn Ile Ala Ser Ser Asp Ser
320 325 330

Glu Val Glu Ile Val Gly Val Gln Glu His Ala Arg Cys Val His
335 340 345

Pro Arg Gly Gly Val Ile Gln Ser Val Ser Ser Trp Lys His Gly
350 355 360

Ser Gly Thr Gln Tyr Val Ser Thr Arg Gln Thr Gln Ser Trp Thr
365 370 375

Ala Val Thr Pro Gln Gln Thr Trp Ala Ser Pro Ala Glu Val Val
380 385 390

Asp Leu Thr Leu Asp Glu Asp Ser Arg Arg Lys Tyr Leu Leu
395 400

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 405 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: THP1NOT01

(B) CLONE: 1866437

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



Met Phe Val Gln Glu Glu Lys Ile Phe Ala Gly Lys Val Leu Arg
5 10 15

Leu His Ile Cys Ala Ser Asp Gly Ala Glu Trp Leu Glu Glu Ala
20 25 30

Thr Glu Asp Thr Ser Val Glu Lys Leu Lys Glu Arg Cys Leu Lys
35 40 45

His Cys Ala His Gly Ser Leu Glu Asp Pro Lys Ser Ile Thr His
50 55 60

His Lys Leu Ile His Ala Ala Ser Glu Arg Val Leu Ser Asp Ala
65 70 75

Arg Thr Ile Leu Glu Glu Asn Ile Gln Asp Gln Asp Val Leu Leu
80 85 90

Leu Lys Lys Lys Arg Ala Pro Ser Pro Leu Pro Lys Met Ala Asp
95 100 105

Val Ser Ala Glu Glu Lys Lys Lys Gln Asp Gln Lys Ala Pro Asp
110 115 120

Lys Glu Ala Ile Leu Arg Ala Thr Ala Asn Leu Pro Ser Tyr Asn
125 130 135

Met Asp Arg Ala Ala Val Gln Thr Asn Met Arg Asp Phe Gln Thr
140 145 150

Glu Leu Arg Lys Ile Leu Val Ser Leu Ile Glu Val Ala Gln Lys
155 160 165

Leu Leu Ala Leu Asn Pro Asp Ala Val Glu Leu Phe Lys Lys Ala
170 175 180

Asn Ala Met Leu Asp Glu Asp Glu Asp Glu Arg Val Asp Glu Ala
185 190 195

Ala Leu Arg Gln Leu Thr Glu Met Gly Phe Pro Glu Asn Arg Ala
200 205 210

Thr Lys Ala Leu Gln Leu Asn His Met Ser Val Pro Gln Ala Met
215 220 225

Glu Trp Leu Ile Glu His Ala Glu Asp Pro Thr Ile Asp Thr Pro
230 235 240

Leu Pro Gly Gln Ala Pro Pro Glu Ala Glu Gly Ala Thr Ala Ala
245 250 255

Ala Ser Glu Ala Ala Ala Gly Ala Ser Ala Thr Asp Glu Glu Ala
260 265 270

Arg Asp Glu Leu Thr Glu Ile Phe Lys Lys Ile Arg Arg Lys Arg
275 280 285

Glu Phe Arg Ala Asp Ala Arg Ala Val Ile Ser Leu Met Glu Met
290 295 300

Gly Phe Asp Glu Lys Glu Val Ile Asp Ala Leu Arg Val Asn Asn
305 310 315

Asn Gln Gln Asn Ala Ala Cys Glu Trp Leu Leu Gly Asp Arg Lys
320 325 330

Pro Ser Pro Glu Glu Leu Asp Lys Gly Ile Asp Pro Asp Ser Pro
335 340 345

Leu Phe Gln Ala Ile Leu Asp Asn Pro Val Val Gln Leu Gly Leu
350 355 360

Thr Asn Pro Lys Thr Leu Leu Ala Phe Glu Asp Met Leu Glu Asn
365 370 375

Pro Leu Asn Ser Thr Gln Trp Met Asn Asp Pro Glu Thr Gly Pro
380 385 390

Val Met Leu Gln Ile Ser Arg Ile Phe Gln Thr Leu Asn Arg Thr
395 400 405






(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 177 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SKINBIT01 

(B) CLONE: 1871375 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Met Val Met His Asn Ser Asp Pro Asn Leu His Leu Leu Ala Glu
5 10 15

Gly Ala Pro Ile Asp Trp Gly Glu Glu Tyr Ser Asn Ser Gly Gly
20 25 30

Gly Gly Ser Pro Ala Pro Ala Pro Arg Ser Gln Pro Pro Ser Arg
35 40 45

Lys Ser Asp Gly Ala Pro Ser Arg Trp Ser Leu Trp Ser Arg Met
50 55 60

Arg Arg Trp Gly Cys Pro Leu Arg Leu Ala Leu Ser His His His
65 70 75

Leu Arg Pro Arg Thr Val Ser Leu Arg Ser Glu Ala Cys Trp Pro
80 85 90

Lys Val Cys Gly Leu Arg Ala Pro His Gln Pro Ala Pro Cys Ser
95 100 105

Thr Gly Pro Pro Leu Gly Arg Val Pro Ser Leu Arg Pro Pro Pro
110 115 120

Arg Pro Pro Arg Arg Leu Pro His Pro Ser Ser Ile Ser Cys Leu
125 130 135

Glu Arg Leu Trp Thr Leu Gly Pro Pro Ser Pro Ala Thr Arg Arg
140 145 150

Leu Glu Ser Arg Cys Pro Ala Pro Ala Ala Thr Pro Pro Ser Thr
155 160 165

Pro Pro Pro Arg Xaa Xaa Phe Lys Gly Cys Lys Asn
170 175



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 197 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LEUKNOT03 

(B) CLONE: 1880830 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Met Ile Thr Cys Arg Val Cys Gln Ser Leu Ile Asn Val Glu Gly
5 10 15

Lys Met His Gln His Val Val Lys Cys Gly Val Cys Asn Glu Ala
20 25 30

Thr Pro Ile Lys Asn Ala Pro Pro Gly Lys Lys Tyr Val Arg Cys
35 40 45

Pro Cys Asn Cys Leu Leu Ile Cys Lys Val Thr Ser Gln Arg Ile
50 55 60

Ala Cys Pro Arg Pro Tyr Cys Lys Arg Ile Ile Asn Leu Gly Pro
65 70 75

Val His Pro Gly Pro Leu Ser Pro Glu Pro Gln Pro Met Gly Val
80 85 90

Arg Val Ile Cys Gly His Cys Lys Asn Thr Phe Leu Trp Thr Glu
95 100 105

Phe Thr Asp Arg Thr Leu Ala Arg Cys Pro His Cys Arg Lys Val
110 115 120

Ser Ser Ile Gly Arg Arg Tyr Pro Arg Lys Arg Cys Ile Cys Cys
125 130 135

Phe Leu Leu Gly Leu Leu Leu Ala Val Thr Ala Thr Gly Leu Ala
140 145 150

Phe Gly Thr Trp Lys His Ala Arg Arg Tyr Gly Gly Ile Tyr Ala
155 160 165

Ala Trp Ala Phe Val Ile Leu Leu Ala Val Leu Cys Leu Gly Arg
170 175 180

Ala Leu Tyr Trp Ala Cys Met Lys Val Ser His Pro Val Gln Asn
185 190 195

Phe Ser



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 302 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: OVARNOT07 

(B) CLONE: 1905325 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Met Leu Lys Asp Ile Ile Lys Glu Tyr Thr Asp Val Tyr Pro Glu
5 10 15

Ile Ile Glu Arg Ala Gly Tyr Ser Leu Glu Lys Val Phe Gly Ile
20 25 30

Gln Leu Lys Glu Ile Asp Lys Asn Asp His Leu Tyr Ile Leu Leu
35 40 45

Ser Thr Leu Glu Pro Thr Asp Ala Gly Ile Leu Gly Thr Thr Lys
50 55 60

Asp Ser Pro Lys Leu Gly Leu Leu Met Val Leu Leu Ser Ile Ile
65 70 75

Phe Met Asn Gly Asn Arg Ser Ser Glu Ala Val Ile Trp Glu Val
80 85 90

Leu Arg Lys Leu Gly Leu Arg Pro Gly Ile His His Ser Leu Phe
95 100 105

Gly Asp Val Lys Lys Leu Ile Thr Asp Glu Phe Val Lys Gln Lys
110 115 120

Tyr Leu Asp Tyr Ala Arg Val Pro Asn Ser Asn Pro Pro Glu Tyr
125 130 135

Glu Phe Phe Trp Gly Leu Arg Ser Tyr Tyr Glu Thr Ser Lys Met
140 145 150

Lys Val Leu Lys Phe Ala Cys Lys Val Gln Lys Lys Asp Pro Lys
155 160 165

Glu Trp Ala Ala Gln Tyr Arg Glu Ala Met Glu Ala Asp Leu Lys
170 175 180

Ala Ala Ala Glu Ala Ala Ala Glu Ala Lys Ala Arg Ala Glu Ile
185 190 195

Arg Ala Arg Met Gly Ile Gly Leu Gly Ser Glu Asn Ala Ala Gly
200 205 210

Pro Cys Asn Trp Asp Glu Ala Asp Ile Gly Pro Trp Ala Lys Ala
215 220 225

Arg Ile Gln Ala Gly Ala Glu Ala Lys Ala Lys Ala Gln Glu Ser
230 235 240

Gly Ser Ala Ser Thr Gly Ala Ser Thr Ser Thr Asn Asn Ser Ala
245 250 255

Ser Ala Ser Ala Ser Thr Ser Gly Gly Phe Ser Ala Gly Ala Ser
260 265 270

Leu Thr Ala Thr Leu Thr Phe Gly Leu Phe Ala Gly Leu Gly Gly
275 280 285

Ala Gly Ala Ser Thr Ser Gly Ser Ser Gly Ala Cys Gly Phe Ser
290 295 300

Tyr Lys



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 164 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTTUT01 

(B) CLONE: 1919931 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Met Arg Thr Leu Glu Asn Gln Gly Phe Lys Ile Leu Pro Phe Leu
5 10 15

Gly Val Lys Glu Val Trp Gln Lys Gln Asn Lys Leu Ile Ser Arg
20 25 30

Phe Ile Thr Cys Gln Phe Phe Leu Tyr Asn Phe Leu Asp Ser Gly
35 40 45

Ser Ile Trp Val Gln Ala Asp Phe Pro Pro Ile Leu Gln Cys Gly
50 55 60

Cys Phe Leu Phe His Pro Trp Thr Leu Gln Glu Ile Ala Pro Cys
65 70 75

Phe Cys Leu Cys Ile Thr Glu Lys Gly Ser Met Lys Val Ala Gln
80 85 90

Val Arg Pro Phe His Cys Pro Pro Gly Ala Gly Phe Ala Leu Pro
95 100 105

Ile Leu Gly Leu Leu Gln Gly Leu Val Ile Leu His Ser Pro Leu
110 115 120

His Ile Ser Gln Val Ser Ala Gln Lys Ser Pro Phe Gly Gly Val
125 130 135

Ser Thr Cys His Cys Val Cys Lys Ser Ser Phe Ser Phe Phe Leu
140 145 150

Ala His Leu Thr Leu Val Met Ser Leu Ile Thr Thr Thr Ile
155 160



156 



PF-0459 US 











155 








Glu 


Trp 


Ala 


Ala 


Gin 
170 


Tyr 


Arg 


Glu 


Ala 


Ala 


Ala 


Glu 


Ala 
185 


Ala 


Ala 


Glu 


Arg 


Ala 


Arg 


Met 


Gly 


He 


Gly 


Leu 








200 








Pro 


Cys 


Asn 


Trp 


Asp 


Glu 


Ala 


Asp 








215 








Arg 


He 


Gin 


Ala 


Gly 


Ala 


Glu 


Ala 








230 








Gly 


Ser 


Ala 


Ser 


Thr 


Gly 


Ala 


Ser 








245 








Ser 


Ala 


Ser 


Ala 


Ser 
260 


Thr 


Ser 


Gly 


Leu 


Thr 


Ala 


Thr 


Leu 
275 


Thr 


Phe 


Gly 


Ala 


Gly 


Ala 


Ser 


Thr 

290 


Ser 


Gly 


Ser 


Tyr 


Lys 

















160 










165 


Ala 


Met 


Glu 


Ala 


Asp 


Leu 


Lys 




175 










180 


Ala 


Lys 


Ala 


Arg 


Ala 


Glu 


He 




190 










195 


Gly 


Ser 


Glu 


Asn 


Ala 


Ala 


Gly 


205 










210 


He 


Gly 


Pro 


Trp 


Ala 


Lys 


Ala 




220 










225 


Lys 


Ala 


Lys 


Ala 


Gin 


Glu 


Ser 


235 








240 


Thr 


Ser 


Thr 


Asn 


Asn 


Ser 


Ala 




250 










255 


Gly 


Phe 


Ser 


Ala 


Gly 


Ala 


Ser 


265 










270 


Leu 


Phe 


Ala 


Gly 


Leu 


Gly 


Gly 




280 










285 


Ser 


Gly 


Ala 


Cys 


Gly 


Phe 


Ser 




295 










300 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 164 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTTUT01 

(B) CLONE: 1919931 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 : 



Met 


Arg 


Thr 


Leu 


Glu 


Asn 


Gin 


Gly 


Phe 


Lys 


He 


Leu 


Pro 


Phe 


Leu 








5 










10 










15 


Glv Val 


Lys 


Glu 


Val 


Trp 


Gin 


Lys 


Gin 


Asn 


Lys 


Leu 


He 


Ser 


Arg 








20 










25 










30 


Phe 


He 


Thr 


Cys 


Gin 


Phe 


Phe 


Leu 


Tyr 


Asn 


Phe 


Leu 


Asp 


Ser 


Gly 








35 










40 










45 


Ser 


He 


Trp 


Val 


Gin 


Ala 


Asp 


Phe 


Pro 


Pro 


He 


Leu 


Gin 


Cys 


Gly 








50 










55 










60 


Cys 


Phe 


Leu 


Phe 


His 


Pro 


Trp 


Thr 


Leu 


Gin 


Glu 


He 


Ala 


Pro 


Cys 








65 










70 










75 


Phe 


Cys 


Leu 


Cys 


He 


Thr 


Glu 


Lys 


Gly 


Ser 


Met 


Lys 


Val 


Ala 


Gin 






80 










85 










90 


Val 


Arg 


Pro 


Phe 


His 


Cys 


Pro 


Pro 


Gly 


Ala 


Gly 


Phe 


Ala 


Leu 


Pro 








95 










100 










105 


He 


Leu 


Gly 


Leu 


Leu 


Gin 


Gly 


Leu 


Val 


He 


Leu 


His 


Ser 


Pro 


Leu 








110 










115 










120 


His 


He 


Ser 


Gin 


Val 


Ser 


Ala 


Gin 


Lys 


Ser 


Pro 


Phe 


Gly 


Gly 


Val 








125 










130 










135 


Ser 


Thr 


Cys 


His 


c y s 


Val 


Cys 


Lys 


Ser 


Ser 


Phe 


Ser 


Phe 


Phe 


Leu 








140 










145 










150 


Ala 


His 


Leu 


Thr 


Leu 
155 


Val 


Met 


Ser 


Leu 


Ile
160 


Thr 


Thr 


Thr 


Ile








(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 235 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTNOT04 

(B) CLONE: 1969426 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



Met 


Ser 


Pro 


Thr 


Leu 


Ser 


Ser 


Ile


Thr 


Gln


Gly 


Val 


Pro 


Leu 


Asp 








5 










10 










15 


Thr 


Ser 


Lys 


Leu 


Ser 


Thr 


Asp 


Gln


Arg 


Leu 


Pro 


Pro 


Tyr 


Pro 


Tyr 








20 










25 










30 


Ser 


Ser 


Pro 


Ser 


Leu 


Val 


Leu 


Pro 


Thr 


Gln


Pro 


His 


Thr 


Pro 


Lys 










35 










40 










45 


Ser 


Leu 


Gln


Gln


Pro 


Gly 


Leu 


Pro 


Ser 


Gln


Ser 


Cys 


Ser 


Val 


Gln










50 








55 










60 


Ser 


Ser 


Gly 


Gly 


Gln


Pro 


Pro 


Gly Arg 


Gln


Ser 


His 


Tyr 


Gly 


Thr 






65 










70 










75 


Pro 


Tyr 


Pro 


Pro 


Gly 


Pro 


Ser 


Gly 


His 


Gly 


Gln


Gln


Ser 


Tyr 


His 








80 










85 










90 


Arg 


Pro 


Met 


Ser 


Asp 


Phe 


Asn 


Leu 


Gly 


Asn 


Leu 


Glu 


Gln


Phe 


Ser 








95 










100 










105 


Met 


Glu 


Ser 


Pro 


Ser 


Ala 


Ser 


Leu 


Val 


Leu 


Asp 


Pro 


Pro 


Gly 


Phe 










110 










115 










120 


Ser 


Glu 


Gly 


Pro 


Gly 


Phe 


Leu 


Gly 


Gly 


Glu 


Gly 


Pro 


Met 


Gly 


Gly 








125 










130 










135 


Pro 


Gln


Asp 


Pro 


His 


Thr 


Phe 


Asn 


His 


Gln


Asn 


Leu 


Thr 


His 


Cys 








140 










145 










150 


Ser 


Arg 


His 


Gly 


Ser 


Gly 


Pro 


Asn 


Ile


Ile


Leu 


Thr 


Gly 


Asp 


Ser 






155 










160 










165 


Ser 


Pro 


Gly 


Phe 


Ser 


Lys 


Glu 


Ile


Ala 


Ala 


Ala 


Leu 


Ala 


Gly 


Val 








170 










175 










180 


Pro 


Gly 


Phe 


Glu 


Val 


Ser 


Ala 


Ala 


Gly 


Leu 


Glu 


Leu 


Gly 


Leu 


Gly 








185 










190 










195 


Leu 


Glu 


Asp 


Glu 


Leu 


Arg 


Met 


Glu 


Pro 


Leu 


Gly 


Leu 


Glu 


Gly 


Leu 








200 










205 










210 


Asn 


Met 


Leu 


Ser 


Asp 


Pro 


Cys 


Ala 


Leu 


Leu 


Pro 


Asp 


Pro 


Ala 


Val 










215 










220 










225 


Glu 


Glu 


Ser 


Phe 


Arg 


Ser 


Asp 


Arg 


Leu 


Gln




















230 










235 












(2) INFORMATION FOR SEQ ID NO: 44:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 203 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: UCMCL5T01 




(B) CLONE: 1969948 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Asn Tyr Phe Pro Leu Ala Pro Phe Asn Gln Leu Leu Gln Lys

5 10 15

Asp Ile Ile Ser Glu Leu Leu Thr Ser Asp Asp Met Lys Asn Ala

20 25 30

Tyr Lys Leu His Thr Leu Asp Thr Cys Leu Lys Leu Asp Asp Thr

35 40 45

Val Tyr Leu Arg Asp Ile Ala Leu Ser Leu Pro Gln Leu Pro Arg

50 55 60

Glu Leu Pro Ser Ser His Thr Asn Ala Lys Val Ala Glu Val Leu

65 70 75

Ser Ser Leu Leu Gly Gly Glu Gly His Phe Ser Lys Asp Val His

80 85 90

Leu Pro His Asn Tyr His Ile Asp Phe Glu Ile Arg Met Asp Thr

95 100 105

Asn Arg Asn Gln Val Leu Pro Leu Ser Asp Val Asp Thr Thr Ser

110 115 120

Ala Thr Asp Ile Gln Arg Val Ala Val Leu Cys Val Ser Arg Ser

125 130 135

Ala Tyr Cys Leu Gly Ser Ser His Pro Arg Gly Phe Leu Ala Met

140 145 150

Lys Met Arg His Leu Asn Ala Met Gly Phe His Val Ile Leu Val

155 160 165

Asn Asn Trp Glu Met Asp Lys Leu Glu Met Glu Asp Ala Val Thr

170 175 180

Phe Leu Lys Thr Lys Ile Tyr Ser Val Glu Ala Leu Pro Val Ala

185 190 195

Ala Val Asn Val Gln Ser Thr Gln



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGAST01 

(B) CLONE: 1988911 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:



Met 


Glu 


Arg 


Gly 


Asn 


Val 


Leu 


Ser 


Arg 


Ala 


Pro 


Ser 


Arg 


Ala 


His 






5 










10 










15 


Gly 


Thr 


His 


Phe 


Gly 


Asp 


Asp 


Arg 


Phe 


Glu 


Asp 


Leu 


Glu 


Glu 


Ala 








20 










25 










30 


Asn 


Pro 


Phe 


Ser 


Phe 


Arg 


Glu 


Phe 


Leu 


Lys 


Thr 


Lys 


Asn 


Leu 


Gly 










35 








40 










45 


Leu 


Ser 


Lys 


Glu 


Asp 


Pro 


Ala 


Ser 


Arg 


Ile


Tyr 


Ala 


Lys 


Glu 


Ala 








50 










55 










60 


Ser 


Arg 


His 


Ser 


Leu 


Gly 


Leu 


Asp 


His 


Asn 


Ser 


Pro 


Pro 


Ser 


Gln








65 










70 










75 


Thr 


Gly 


Gly 


Tyr 


Gly 


Leu 


Glu 


Tyr 


Gln


Gln


Pro 


Phe 


Phe 


Glu 


Asp 



200 



(2) INFORMATION FOR SEQ ID NO: 45:






80 



85 



90 



Pro 


Thr 


Gly Ala 


Gly 


Asp 


Leu 


Leu 


Asp 


Glu 


Glu 


Glu 


Asp 


Glu 


Asp 










95 










100 










105 


Thr 


Gly 


Trp 


Ser 


Gly Ala 


Tyr 


Leu 


Pro 


Ser 


Ala 


Ile


Glu 


Gln


Thr 










110 










115 










120 


His 


Pro 


Glu 


Arg 


Val 


Pro 


Ala 


Gly 


Thr 


Ser 


Pro 


Cys 


Ser 


Thr 


Tyr 








125 










130 










135 


Leu 


Ser 


Phe 


Phe 


Ser 


Thr 


Pro 


Ser 


Glu 


Leu 


Ala 


Gly 


Pro 


Glu 


Ser 










140 










145 










150 


Leu 


Pro 


Ser 


Trp 


Ala 


Leu 


Ser 


Asp 


Thr 


Asp 


Ser 


Arg 


Val 


Ser 


Pro 










155 










160 










165 


Ala 


Ser 


Pro 


Ala 


Gly 


Ser 


Pro 


Ser 


Ala 


Asp 


Phe 


Ala 


Val 


His 


Gly 










170 










175 










180 


Glu 


Ser 


Leu 


Gly 


Asp 


Arg 


His 


Leu 


Arg 


Thr 


Leu 


Gln


Ile


Ser 


Tyr 










185 










190 










195 


Asp 


Ala 


Leu 


Lys 


Asp 


Glu 


Asn 


Ser 


Lys 


Leu 


Arg 


Arg 


Lys 


Leu 


Asn 










200 










205 










210 


Glu 


Val 


Gln


Ser 


Phe 


Ser 


Glu 


Ala 


Gln


Thr 


Glu 


Met 


Val 


Arg 


Thr 










215 










220 










225 


Leu 


Glu 


Arg 


Lys 


Leu 


Glu 


Ala 


Lys 


Met 


Ile


Lys 


Glu 


Glu 


Ser 


Asp 








230 










235 










240 


Tyr 


His 


Asp 


Leu 


Glu 


Ser 


Val 


Val 


Gln


Gln


Val 


Glu 


Gln


Asn 


Leu 






245 










250 










255 


Glu 


Leu 


Met 


Thr 


Lys 


Arg 


Ala 


Val 


Lys 


Ala 


Glu 


Asn 


His 


Val 


Val 










260 










265 










270 


Lys 


Leu 


Lys 


Gln


Glu 


Ile


Ser 


Leu 


Leu 


Gln


Ala 


Gln


Val 


Ser 


Asn 






275 










280 










285 


Phe 


Gln


Arg 


Glu 


Asn 


Glu 


Ala 


Leu 


Arg 


Cys 


Gly 


Gln


Gly Ala 


Ser 










290 










295 










300 


Leu 


Thr 


Val 


Val 


Lys 


Gln


Asn 


Ala 


Asp 


Val 


Ala 


Leu 


Gln


Asn 


Leu 










305 










310 










315 


Arg 


Val 


Val 


Met 


Asn 


Ser 


Ala 


Gln


Ala 


Ser 


Ile


Lys 


Gln


Leu 


Val 








320 










325 










330 


Ser 


Gly 


Ala 


Glu 


Thr 


Leu 


Asn 


Leu 


Val 


Ala 


Glu 


Ile


Leu 


Lys 


Ser 








335 










340 










345 


Ile


Asp 


Arg 


Ile


Ser 


Glu 


Val 


Lys 


Asp 


Glu 


Glu 


Glu 


Asp 


Ser 





(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: OVARNOT03

(B) CLONE: 2061561 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Met Gly Gly Lys Pro His Lys Glu Pro Arg Ala Lys Gly Pro Leu 
5 10 15 

Ser Ile Phe Tyr Pro Gly Ser Thr Ala Pro Val Ile Thr Gln Arg
20 25 30 

Thr Pro Xaa Ala Ala Leu Lys Pro Pro Pro Ile Lys Gly Ala Gly



350 



355 



(2) INFORMATION FOR SEQ ID NO: 46:



35 



40 



45 





























Pro 


Thr 


Ile


Ala 


Pro 
50 


Ile


Lys 


Gly 


Xaa 


Xaa 
55 


Asn 


Phe 


Gly 


Lys 


Arg 
60 


Pro 


Thr 


Val 


Thr 


Xaa 
65 


Pro 


Xaa 


Trp 


Xaa 


Ile
70 


Ser 


Pro 


Asn 


Trp 


Gly 
75 


Lys 


Arg 


Gly 


Xaa 


Cys 
80 


Xaa 


Xaa 


Xaa 


Gly 


Ile
85 


Lys 


Trp 


Val 


Xaa 


Pro 
90 


Arg 


Val


Ser 


Gln


Ala 
95 


Arg 


Thr 


Phe 


Lys 


Thr 
100 


Thr 


Ala 


Asn 


Glu 


Leu 
105 


Xaa 


Phe 


Xaa 


Asp 


Thr 
110 


Phe 


Glu 


Glu 


Xaa 


Xaa 
115 


Arg 


Xaa 


Xaa 


His 


Ala 
120 


Xaa 


Val 


Ser 


Xaa 


Glu 
125 


Pro 


Gln


Pro 


Arg 


Cys 
130 


Pro 


Leu 


Gly 


Glu 


Ser 
135 


Arg 


Ser 


Leu 


Gly 


Ala 
140 


Ala 


Val 


Cys 


Arg 


Trp 
145 


Asp 


Ser 


Phe 


Asp 


Phe 
150 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 402 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PANCNOT04

(B) CLONE: 2084489 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:



Met 


Pro 


Pro 


Val 


Ser 
5 


Arg 


Ser 


Ser 


Tyr 


Ser 

10 


Glu 


Asp 


Ile


Val 


Gly 
15 


Ser 


Arg 


Arg 


Arg 


Arg 
20 


Arg 


Ser 


Ser 


Ser 


Gly 
25 


Ser 


Pro 


Pro 


Ser 


Pro 
30 


Gln


Ser 


Arg 


Cys 


Ser 
35 


Ser 


Trp 


Asp 


Gly 


Cys 
40 


Ser 


Arg 


Ser 


His 


Ser 
45 


Arg 


Gly 


Arg 


Glu 


Gly 
50 


Leu 


Arg 


Pro 


Pro 


Trp 
55 


Ser 


Glu 


Leu 


Asp 


Val 
60 


Gly 


Ala 


Leu 


Tyr 


Pro 
65 


Phe 


Ser 


Arg 


Ser 


Gly 
70 


Ser 


Arg 


Gly 


Arg 


Leu 
75 


Pro 


Arg 


Phe 


Arg 


Asn 
80 


Tyr 


Ala 


Phe 


Ala 


Ser 
85 


Ser 


Trp 


Ser 


Thr 


Ser 
90 


Tyr 


Ser 


Gly 


Tyr 


Arg 
95 


Tyr 


His 


Arg 


His 


Cys 
100 


Tyr 


Ala 


Glu 


Glu 


Arg 
105 


Gln


Ser 


Ala 


Glu 


Asp 
110 


Tyr 


Glu 


Lys 


Glu 


Glu 
115 


Ser 


His 


Arg 


Gln


Arg 
120 


Arg 


Leu 


Lys 


Glu 


Arg 

125 


Glu 


Arg 


Ile


Gly 


Glu 
130 


Leu 


Gly 


Ala 


Pro 


Glu 
135 


Val 


Trp 


Gly 


Pro 


Ser 
140 


Pro 


Lys 


Phe 


Pro 


Gln
145 


Leu 


Asp 


Ser 


Asp 


Glu 
150 


His 


Thr 


Pro 


Val 


Glu 
155 


Asp 


Glu 


Glu 


Glu 


Val 
160 


Thr 


His 


Gln


Lys 


Ser 
165 


Ser 


Ser 


Ser 


Asp 


Ser 
170 


Asn 


Ser 


Glu 


Glu 


His 

175 


Arg 


Lys 


Lys 


Lys 


Thr 
180 


Ser 


Arg 


Ser 


Arg 


Asn 
185 


Lys 


Lys 


Lys 


Arg 


Lys 
190 


Asn 


Lys 


Ser 


Ser 


Lys 
195 


Arg 


Lys 


His 


Arg 


Lys 


Tyr 


Ser 


Asp 


Ser 


Asp 


Ser 


Asn 


Ser 


Glu 


Ser 



200



Asp 


Thr 


Asn 


Ser 


Asp 


Ser 


Asp 








215 






Lys 


Lys 


Lys 


Lys 


Lys 


Lys 


Lys 










230 






Asn 


Lys 


Lys 


Thr 


Lys 


Lys 


Glu 










245 






Ser 


Glu 


Glu 


Asp 


Leu 


Ser 


Glu 










260 






Val 


Ala 


Asp 


Thr 


Met 


Asp 


Leu 










275 






His 


Thr 


Ser 


Gln


Asp 


Glu 


Lys 










290 






Leu 


Pro 


Gly 


Glu 


Gly 


Ala 


Ala 










305 






Lys 


Arg 


Ile


Pro 


Arg 


Arg 


Gly 








320 






Ile


Gly 


Ser 


Phe 


Glu 


Cys 


Ser 










335 






His 


Arg 


Arg 


Met 


Glu 


Ala 


Val 










350 






Tyr 


Ser 


Ala 


Asp 


Glu 


Lys 


Arg 








365 






Glu 


Arg 


Arg 


Lys 


Arg 


Glu 


Ser 










380 






Met 


Val 


His 


Lys 


Lys 


Thr 


Lys 



395 







205 










210 


Asp 


Asp 


Lys 


Lys 


Arg 


Val 


Lys 


Ala 




220 










225 


Lys 


His 


Lys 


Thr 


Lys 


Lys 


Lys 


Lys 




235 










240 


Ser 


Ser 


Asp 
250 


Ser 


Ser 


Cys 


Lys 


Asp 
255 


Ala 


Thr 


Trp 

265 


Met 


Glu 


Gln


Pro 


Asn 
270 


Ile


Gly 


Pro 
280 


Glu 


Ala 


Pro 


Ile


Ile
285 


Pro 


Leu 


Lys 
295 


Tyr 


Gly 


His 


Ala 


Leu 
300 


Met 


Ala 


Glu 

310 


Tyr 


Val 


Lys 


Ala 


Gly 
315 


Glu 


Ile


Gly 
325 


Leu 


Thr 


Ser 


Glu 


Glu 
330 


Gly 


Tyr 


Val 


Met 


Ser 


Gly 


Ser 


Arg 




340 










345 


Arg 


Leu 


Arg 
355 


Lys 


Glu 


Asn 


Gln


Ile
360 


Ala 


Leu 


Ala 
370 


Ser 


Phe 


Asn 


Gln


Glu 
375 


Lys 


Ile


Leu 


Ala 


Ser 


Phe 


Arg 


Glu 




385 










390 


Glu 


Lys 


Asp 
400 


Asp 


Lys 









(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 311 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SPLNFET02 

(B) CLONE: 2203226 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:



Met 


His 


Pro 


Ala 


Gly 
5 


Leu 


Ala 


Ala 


Ala 


Ala 
10 


Ala 


Gly 


Thr 


Pro 


Arg 
15 


Leu 


Pro 


Ser 


Lys 


Arg 


Arg 


Ile


Pro 


Val 


Ser 


Gln


Pro 


Gly 


Met 


Ala 








20 










25 










30 


Asp 


Pro 


His 


Gln


Leu 


Phe 


Asp 


Asp 


Thr 


Ser 


Ser 


Ala 


Gln


Ser 


Arg 








35 










40 










45 


Gly Tyr 


Gly 


Ala 


Gln


Arg 


Ala 


Pro 


Gly 


Gly 


Leu 


Ser 


Tyr 


Pro 


Ala 










50 










55 










60 


Ala 


Ser 


Pro 


Thr 


Pro 
65 


His 


Ala 


Ala 


Phe 


Leu 
70 


Ala 


Asp 


Pro 


Val 


Ser 
75 


Asn 


Met 


Ala 


Met 


Ala 


Tyr 


Gly 


Ser 


Ser 


Leu 


Ala 


Ala 


Gln


Gly 


Lys 










80 








85 










90 


Glu 


Leu 


Val 


Asp 


Lys 


Asn 


Ile


Asp 


Arg 


Phe 


Ile


Pro 


Ile


Thr 


Lys 








95 










100 










105 


Leu 


Lys 


Tyr 


Tyr 


Phe 


Ala 


Val 


Asp 


Thr 


Met 


Tyr 


Val 


Gly 


Arg 


Lys 




110 










115 










120 






Leu 


Gly 


Leu 


Leu 


Phe 


Phe 


Pro 


Tyr 


Leu 


His 


Gln


Asp 


Trp 


Glu 


Val 








125 










130 










135 


Gln


Tyr 


Gln


Gln


Asp 


Thr 


Pro 


Val 


Ala 


Pro 


Arg 


Phe 


Asp 


Val 


Asn 








140 










145 










150 


Ala 


Pro 


Asp 


Leu 


Tyr 


Ile


Pro 


Ala 


Met 


Ala 


Phe 


Ile


Thr 


Tyr 


Val 








155 










160 










165 


Leu 


Val 


Ala 


Gly 


Leu 


Ala 


Leu 


Gly 


Thr 


Gln


Asp 


Arg 


Phe 


Ser 


Pro 








170 










175 










180 


Asp 


Leu 


Leu 


Gly 


Leu 


Gln


Ala 


Ser 


Ser 


Ala 


Leu 


Ala 


Trp 


Leu 


Thr 






185 










190 










195 


Leu 


Glu 


Val 


Leu 


Ala 


Ile


Leu 


Leu 


Ser 


Leu 


Tyr 


Leu 


Val 


Thr 


Val 










200 










205 










210 


Asn 


Thr 


Asp 


Leu 


Thr 


Thr 


Ile


Asp 


Leu 


Val 


Ala 


Phe 


Leu 


Gly 


Tyr 








215 










220 










225 


Lys 


Tyr 


Val 


Gly 


Met 


Ile


Gly 


Gly 


Val 


Leu 


Met 


Gly 


Leu 


Leu 


Phe 






230 










235 










240 


Gly 


Lys 


Ile


Gly 


Tyr 


Tyr 


Leu 


Val 


Leu 


Gly 


Trp 


Cys 


Cys 


Val 


Ala 






245 










250 










255 


Ile


Phe 


Val 


Phe 


Met 


Ile


Arg 


Thr 


Leu 


Arg 


Leu 


Lys 


Ile


Leu 


Ala 










260 










265 










270 


Asp 


Ala 


Ala 


Ala 


Glu 


Gly 


Val 


Pro 


Val 


Arg 


Gly 


Ala 


Arg 


Asn 


Gln








275 










280 










285 


Leu 


Arg 


Met 


Tyr 


Leu 


Thr 


Met 


Ala 


Val 


Ala 


Ala 


Ala 


Gln


Pro 


Met 






290 










295 










300


Leu 


Met 


Tyr 


Trp 


Leu 


Thr 


Phe 


His 


Leu 


Val 


Arg 
















305 










310 












(2) INFORMATION FOR SEQ ID NO: 49:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 316 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT16

(B) CLONE: 2232884 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:



Met 


Ala 


Ser 


Ala 


Asp 
5 


Glu 


Leu 


Thr 


Phe 


His 
10 


Glu 


Phe 


Glu 


Glu 


Ala 
15 


Thr 


Asn 


Leu 


Leu 


Ala 


Asp 


Thr 


Pro 


Asp 


Ala 


Ala 


Thr 


Thr 


Ser 


Arg 










20 








25 










30 


Ser 


Asp 


Gln


Leu 


Thr 


Pro 


Gln


Gly 


His 


Val 


Ala 


Val 


Ala 


Val 


Gly 








35 










40 










45 


Ser 


Gly 


Gly 


Ser 


Tyr 


Gly 


Ala 


Glu 


Asp 


Glu 


Val 


Glu 


Glu 


Glu 


Ser 






50 










55 










60 


Asp 


Lys 


Ala 


Ala 


Leu 


Leu 


Gln


Glu 


Gln


Gln


Gln


Gln


Gln


Gln


Pro 






65 










70 










75 


Gly 


Phe 


Trp 


Thr 


Phe 


Ser 


Tyr 


Tyr 


Gln


Ser 


Phe 


Phe 


Asp 


Val 


Asp 






80 










85 










90 


Thr 


Ser 


Gln


Val 


Leu 
95 


Asp 


Arg 


Ile


Lys 


Gly 

100 


Ser 


Leu 


Leu 


Pro 


Arg 
105 


Pro 


Gly 


His 


Asn 


Phe 


Val 


Arg 


His 


His 


Leu 


Arg 


Asn 


Arg 


Pro 


Asp 








110 










115 










120 


Leu 


Tyr 


Gly 


Pro 


Phe 


Trp 


Ile


Cys 


Ala 


Thr 


Leu 


Ala 


Phe 


Val 


Leu 



125










130 












Ala 


Val 


Thr 


Gly 


Asn 


Leu 


Thr 


Leu 


Val 


Leu 


Ala 


Gln


Arg 


Arg 


Asp 








140 










145 










150


Pro 


Ser 


Ile


His 


Tyr 
155 


Ser 


Pro 


Gln


Phe 


His 
160 


Lys 


Val 


Thr 


Val 


Ala 
165 


Gly 


Ile


Ser 


Ile


Tyr 


Cys 


Tyr 


Ala 


Trp 


Leu 


Val 


Pro 


Leu 


Ala 


Leu 








170 










175 










180


Trp 


Gly 


Phe 


Leu 


Arg 


Trp 


Arg 


Lys 


Gly 


Val 


Gln


Glu 


Arg 


Met 


Gly 






185 










190 










195


Pro 


Tyr 


Thr 


Phe 


Leu 


Glu 


Thr 


Val 


Cys 


Ile


Tyr 


Gly 


Tyr 


Ser 


Leu 








200 










205 










210


Phe 


Val 


Phe 


Ile


Pro 
215 


Met 


Val 


Val 


Leu 


Trp 
220 


Leu 


Ile


Pro 


Val 


Pro 
225 


Trp 


Leu 


Gln


Trp 


Leu 


Phe 


Gly 


Ala 


Leu 


Ala 


Leu 


Gly 


Leu 


Ser 


Ala 






230 










235 










240 


Ala 


Gly 


Leu 


Val 


Phe 


Thr 


Leu 


Trp 


Pro 


Val 


Val 


Arg 


Glu 


Asp 


Thr 








245 










250 










255 


Arg 


Leu 


Val 


Ala 


Thr 


Val 


Leu 


Leu 


Ser 


Val 


Val 


Val 


Leu 


Leu 


His 








260 










265 










270 


Ala 


Leu 


Leu 


Ala 


Met 


Gly 


Cys 


Lys 


Leu 


Tyr 


Phe 


Phe 


Gln


Ser 


Leu 










275 








280 










285 


Pro 


Pro 


Glu 


Asn 


Val 
290 


Ala 


Pro 


Pro 


Pro 


Gln
295 


Ile


Thr 


Ser 


Leu 


Pro 
300 


Ser 


Asn 


Ile


Ala 


Leu 
305 


Ser 


Pro 


Thr 


Leu 


Pro 
310 


Gln


Ser 


Leu 


Ala 


Pro 
315 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 346 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: COLNNOT11

(B) CLONE: 2328134 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:



Met 


Thr 


Pro 


Arg 


Thr 


Trp 


Trp 


Pro 


Arg 


Pro 


Ala 


Gly 


Trp 


Gly 


Thr 








5 










10 










15 


Cys 


Arg 


Ala 


Ala 


Gly 


Trp 


Pro 


Arg 


Ser 


Val 


Pro 


Trp 


Ala 


Arg 


Thr 






20 










25 










30 


Ala 


Ala 


Ser 


Leu 


Val 

35 


Phe 


Val 


Pro 


Thr 


Arg 
40 


Arg 


Arg 


Ser 


Gly 


Pro 
45 


Ser 


Gly 


Thr 


Ala 


Ser 


Val 


Ala 


Ala 


Met 


Ala 


Tyr 


His 


Ser 


Gly 


Tyr 








50 










55 










60 


Gly Ala 


His 


Gly 


Ser 


Lys 


His 


Arg 


Ala 


Arg 


Ala 


Ala 


Pro 


Asp 


Pro 










65 










70 










75 


Pro 


Pro 


Leu 


Phe 


Asp 


Asp 


Thr 


Ser 


Gly 


Gly 


Tyr 


Ser 


Ser 


Gln


Pro 










80 








85 










90 


Gly 


Gly 


Tyr 


Pro 


Ala 


Thr 


Gly 


Ala 


Asp 


Val 


Ala 


Phe 


Ser 


Val 


Asn 




95 










100 










105 


His 


Leu 


Leu 


Gly 


Asp 


Pro 


Met 


Ala 


Asn 


Val 


Ala 


Met 


Ala 


Tyr 


Gly 








110 










115 










120 


Ser 


Ser 


Ile


Ala 


Ser 
125 


His 


Gly 


Lys 


Asp 


Met 
130 


Val 


His 


Lys 


Glu 


Leu 
135 



His


Arg 


Phe 


Val 


Ser 


Val 


Ser 


Lys 


Leu 


Lys 


Tyr 


Phe 


Phe 


Ala 


Val 








140 










145 










150 


Asp 


Thr 


Ala 


Tyr 


Val 


Ala 


Lys 


Lys 


Leu 


Gly 


Leu 


Leu 


Val 


Phe 


Pro 






155 










160 










165


Tyr 


Thr 


His 


Gln


Asn 


Trp 


Glu 


Val 


Gln


Tyr 


Ser 


Arg 


Asp 


Ala 


Pro 








170 










175 










180 


Leu 


Pro 


Pro 


Arg 


Gln


Asp 


Leu 


Asn 


Ala 


Pro 


Asp 


Leu 


Tyr 


Ile


Pro 








185 










190 










195 


Thr 


Met 


Ala 


Phe 


Ile
200 


Thr 


Tyr 


Val 


Leu 


Leu 
205 


Ala 


Gly 


Met 


Ala 


Leu 
210 


Gly 


Ile


Gln


Lys 


Arg 


Phe 


Ser 


Pro 


Glu 


Val 


Leu 


Gly 


Leu 


Cys 


Ala 






215 










220 










225 


Ser 


Thr 


Ala 


Leu 


Val 

230 


Trp 


Val 


Val 


Met 


Glu 
235 


Val 


Leu 


Ala 


Leu 


Leu 
240 


Leu 


Gly 


Leu 


Tyr 


Leu 


Ala 


Thr 


Val 


Arg 


Ser 


Asp 


Leu 


Ser 


Thr 


Phe 






245 










250 










255 


His 


Leu 


Leu 


Ala 


Tyr 

260 


Ser 


Gly 


Tyr 


Lys 


Tyr 
265 


Val 


Gly 


Met 


Ile


Leu 
270 


Ser 


Val 


Leu 


Thr 


Gly 
275 


Leu 


Leu 


Phe 


Gly 


Ser 
280 


Asp 


Gly 


Tyr 


Tyr 


Val 
285 


Ala 


Leu 


Ala 


Trp 


Thr 


Ser 


Ser 


Ala 


Leu 


Met 


Tyr 


Phe 


Ile


Val 


Arg 








290 










295 










300 


Ser 


Leu 


Arg 


Thr 


Ala 


Ala 


Leu 


Gly 


Pro 


Asp 


Ser 


Met 


Gly 


Gly 


Pro 








305 










310 










315 


Val 


Pro 


Arg 


Gln


Arg 


Leu 


Gln


Leu 


Tyr 


Leu 


Thr 


Leu 


Gly 


Ala 


Ala 








320 










325 










330 


Ala 


Phe 


Gln


Pro 


Leu 
335 


Ile


Ile


Tyr 


Trp 


Leu 
340 


Thr 


Phe 


His 


Leu 


Val 
345 



Arg 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ISLTNOT01 

(B) CLONE: 2382718 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:








Met 


Gly 


Thr 


Lys 


Ala 
5 


Gln


Val 


Glu 


Arg 


Lys 
10 


Leu 


Leu 


Cys 


Leu 


Phe 
15 


Ile


Leu 


Ala 


Ile


Leu 
20 


Leu 


Cys 


Ser 


Leu 


Ala 

25 


Leu 


Gly 


Ser 


Val 


Thr 
30 


Val 


His 


Ser 


Ser 


Glu 
35 


Pro 


Glu 


Val 


Arg 


Ile
40 


Pro 


Glu 


Asn 


Asn 


Pro 
45 


Val 


Lys 


Leu 


Ser 


Cys 
50 


Ala 


Tyr 


Ser 


Gly 


Phe 
55 


Ser 


Ser 


Pro 


Arg 


Val 
60 


Glu 




Lys 


Phe 


Asp 
65 


Gln


Gly 


Asp 


Thr 


Thr 
70 


Arg 


Leu 


Val 


Cys 


Tyr 
75 


Asn 


Asn 


Lys 


Ile


Thr 
80 


Ala 


Ser 


Tyr 


Glu 


Asp 
85 


Arg 


Val 


Thr 


Phe 


Leu 
90 


Pro 


Thr 


Gly 


Ile


Thr 
95 


Phe 


Lys 


Ser 


Val 


Thr 
100 


Arg 


Glu 


Asp 


Thr 


Gly 
105 


Thr 


Tyr 


Thr 


Cys 


Met 


Val 


Ser 


Glu 


Glu 


Gly 


Gly 


Asn 


Ser 


Tyr 


Gly 



110










115 










120 


Glu 


Val 


Lys 


Val 


Lys 
125 


Leu 


Ile


Val 


Leu 


Val 

130 


Pro 


Pro 


Ser 


Lys 


Pro 
135 


Thr 


Val 


Asn 


Ile


Pro 
140 


Ser 


Ser 


Ala 


Thr 


Ile
145 


Gly 


Asn 


Arg 


Ala 


Val 
150 


Leu 


Thr 


Cys 


Ser 


Glu 
155 


Gln


Asp 


Gly 


Ser 


Pro 
160 


Pro 


Ser 


Glu 


Tyr 


Thr 
165 


Trp 


Phe 


Lys 


Asp 


Gly 
170 


Ile


Val 


Met 


Pro 


Thr 
175 


Asn 


Pro 


Lys 


Ser 


Thr 

180 


Arg 


Ala 


Phe 


Ser 


Asn 
185 


Ser 


Ser 


Tyr 


Val 


Leu 
190 


Asn 


Pro 


Thr 


Thr 


Gly 
195 


Glu 


Leu 


Val 


Phe 


Asp 
200 


Pro 


Leu 


Ser 


Ala 


Ser 
205 


Asp 


Thr 


Gly 


Glu 


Tyr 

210 


Ser 


Cys 


Glu 


Ala 


Arg 
215 


Asn 


Gly 


Tyr 


Gly 


Thr 

220 


Pro 


Met 


Thr 


Ser 


Asn 
225 


Ala 


Val 


Arg 


Met 


Glu 
230 


Ala 


Val 


Glu 


Arg 


Asn 
235 


Val 


Gly 


Val 


Ile


Val 
240 


Ala 


Ala 


Val 


Leu 


Val 
245 


Thr 


Leu 


Ile


Leu 


Leu 
250 


Gly 


Ile


Leu 


Val 


Phe 
255 


Gly 


Ile


Trp 


Phe 


Ala 

260 


Tyr 


Ser 


Arg 


Gly 


His 
265 


Phe 


Asp 


Arg 


Thr 


Lys 
270 


Lys 


Gly 


Thr 


Ser 


Ser 
275 


Lys 


Lys 


Val 


Ile


Tyr 
280 


Ser 


Gln


Pro 


Ser 


Ala 
285 


Arg 


Ser 


Glu 


Gly 


Glu 
290 


Phe 


Lys 


Gln


Thr 


Ser 
295 


Ser 


Phe 


Leu 


Val 





(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 351 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ENDANOT01 

(B) CLONE: 2452208 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:



Met 


Ala 


Ser 


Thr 


Gly 

5 


Ser 


Gln


Ala 


Ser 


Asp 
10 


Ile


Asp 


Glu 


Ile


Phe 
15 


Gly 


Phe 


Phe 


Asn 


Asp 

20 


Gly 


Glu 


Pro 


Pro 


Thr 
25 


Lys 


Lys 


Pro 


Arg 


Lys 
30 


Leu 


Leu 


Pro 


Ser 


Leu 
35 


Lys 


Thr 


Lys 


Lys 


Pro 
40 


Arg 


Glu 


Leu 


Val 


Leu 
45 


Val 


Ile


Gly 


Thr 


Gly 
50 


Ile


Ser 


Ala 


Ala 


Val 
55 


Ala 


Pro 


Gln


Val 


Pro 
60 


Ala 


Leu 


Lys 


Ser 


Trp 
65 


Lys 


Gly 


Leu 


Ile


Gln
70 


Ala 


Leu 


Leu 


Asp 


Ala 
75 


Ala 


Ile


Asp 


Phe 


Asp 
80 


Leu 


Leu 


Glu 


Asp 


Glu 
85 


Glu 


Ser 


Lys 


Lys 


Phe 
90 


Gln


Lys 


Cys


Leu 


His 
95 


Glu 


Asp 


Lys 


Asn 


Leu 
100 


Val 


His 


Val 


Ala 


His 
105 


Asp 


Leu 


Ile


Gln


Lys 
110 


Leu 


Ser 


Pro 


Arg 


Thr 
115 


Ser 


Asn 


Val 


Arg 


Ser 

120 


Thr 


Phe 


Phe 


Lys 


Asp 
125 


Cys 


Leu 


Tyr 


Glu 


Val 
130 


Phe 


Asp 


Asp 


Leu 


Glu 
135 





























Ser 


Lys 


Met 


Glu 


Asp 


Ser 


Gly 


Lys 


Gln


Leu 


Leu 


Gln


Ser 


Val 


Leu 








140 










145 










150 


His 


Leu 


Met 


Glu 


Asn 


Gly Ala 


Leu 


Val 


Leu 


Thr 


Thr 


Asn 


Phe 


Asp 










155 










160 










165 


Asn 


Leu 


Leu 


Glu 


Leu 


Tyr 


Ala 


Ala 


Asp 


Gln


Gly 


Lys 


Gln


Leu 


Glu 










170 










175 










180 


Ser 


Leu 


Asp 


Leu 


Thr 


Asp 


Glu 


Lys 


Lys 


Val 


Leu 


Glu 


Trp 


Ala 


Gln








185 










190 










195 


Glu 


Lys 


Arg 


Lys 


Leu 


Ser 


Val 


Leu 


His 


Ile


His 


Gly 


Val 


Tyr 


Thr 








200 










205 










210 


Asn 


Pro 


Ser 


Gly 


Ile


Val 


Leu 


His 


Pro 


Ala 


Gly 


Tyr 


Gln


Asn 


Val 








215 










220 










225 


Leu 


Arg 


Asn 


Thr 


Glu 


Val 


Met 


Arg 


Glu 


Ile


Gln


Lys 


Leu 


Tyr 


Glu 








230 










235 










240 


Asn 


Lys 


Ser 


Phe 


Leu 


Phe 


Leu 


Gly 


Cys 


Gly 


Trp 


Thr 


Val 


Asp 


Asp 








245 










250 










255 


Thr 


Thr 


Phe 


Gln


Ala 


Leu 


Phe 


Leu 


Glu 


Ala 


Val 


Lys 


His 


Lys 


Ser 










260 










265 










270 


Asp 


Leu 


Glu 


His 


Phe 


Met 


Leu 


Val 


Arg 


Arg 


Gly Asp 


Val 


Asp 


Glu 








275 










280 










285 


Phe


Lys 


Lys 


Leu 


Arg 


Glu 


Asn 


Met 


Leu 


Asp 


Lys 


Gly 


Ile


Lys 


Val 






290 










295 










300 


Ile


Ser 


Tyr 


Gly 


Asp 


Asp 


Tyr 


Ala 


Asp 


Leu 


Pro 


Glu 


Tyr 


Phe 


Lys 








305 










310 










315 


Arg 


Leu 


Thr 


Cys 


Glu 


Ile


Ser 


Thr 


Arg 


Gly 


Thr 


Ser 


Ala 


Gly 


Met 






320 










325 










330 


Val 


Arg 


Glu 


Gly 


Gln


Leu 


Asn 


Gly 


Ser 


Ser 


Ala 


Ala 


His 


Ser 


Glu 








335 










340 










345 


Ile


Arg 


Gly 


Cys 


Ser 


Thr 





















(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 662 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ENDANOT01 

(B) CLONE: 2457825 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:



Met 


Thr 


Ala 


Lys 


Lys 
5 


Gln


Cys 


Leu 


Leu 


Arg 
10 


Leu 


Gly 


Val 


Leu 


Arg 
15 


Gln


Asp 


Trp 


Pro 


Asp 

20 


Thr 


Asn 


Arg 


Leu 


Leu 
25 


Gly 


Ser 


Ala 


Asn 


Val 
30 


Val 


Pro 


Glu 


Ala 


Leu 

35 


Gln


Arg 


Phe 


Thr 


Arg 
40 


Ala 


Ala 


Ala 


Asp 


Phe 
45 


Ala 


Thr 


His 


Gly 


Lys 
50 


Leu 


Gly 


Lys 


Leu 


Glu 
55 


Phe 


Ala 


Gln


Asp 


Ala 
60 


His 


Gly 


Gln


Pro 


Asp 
65 


Val 


Ser 


Ala 


Phe 


Asp 
70 


Phe 


Thr 


Ser 


Met 


Met 
75 


Arg 


Ala 


Glu 


Ser 


Ser 
80 


Ala 


Arg 


Val 


Gln


Glu 
85 


Lys 


His 


Gly Ala Arg 
90 


Leu 


Leu 


Leu 


Gly 


Leu 


Val 


Gly 


Asp 


Cys 


Leu 


Val 


Glu 


Pro 


Phe 


Trp 



350 



(2) INFORMATION FOR SEQ ID NO: 53:

95 100 105

Pro Leu Gly Thr Gly Val Ala Arg Gly Phe Leu Ala Ala Phe Asp
110 115 120

Ala Ala Trp Met Val Lys Arg Trp Ala Glu Gly Ala Glu Ser Leu
125 130 135

Glu Val Leu Ala Glu Arg Glu Ser Leu Tyr Gln Leu Leu Ser Gln
140 145 150

Thr Ser Pro Glu Asn Met His Arg Asn Val Ala Gln Tyr Gly Leu
155 160 165

Asp Pro Ala Thr Arg Tyr Pro Asn Leu Asn Leu Arg Ala Val Thr
170 175 180

Pro Asn Gln Val Arg Asp Leu Tyr Asp Val Leu Ala Lys Glu Pro
185 190 195

Val Gln Arg Asp Asn Asp Lys Thr Asp Thr Gly Met Pro Ala Thr
200 205 210

Gly Ser Ala Gly Thr Gln Glu Glu Leu Leu Arg Trp Cys Gln Glu
215 220 225

Gln Thr Ala Gly Tyr Pro Gly Val His Val Ser Asp Leu Ser Ser
230 235 240

Ser Trp Ala Asp Gly Leu Ala Leu Cys Ala Leu Val Tyr Arg Leu
245 250 255

Gln Pro Gly Leu Leu Glu Pro Ser Glu Leu Gln Gly Leu Gly Ala
260 265 270

Leu Glu Ala Thr Ala Trp Ala Leu Lys Val Ala Glu Asn Glu Leu
275 280 285

Gly Ile Thr Pro Val Val Ser Ala Gln Ala Val Val Ala Gly Ser
290 295 300

Asp Pro Leu Gly Leu Ile Ala Tyr Leu Ser His Phe His Ser Ala
305 310 315

Phe Lys Ser Met Ala His Ser Pro Gly Pro Val Ser Gln Ala Ser
320 325 330

Pro Gly Thr Ser Ser Ala Val Leu Phe Leu Ser Lys Leu Gln Arg
335 340 345

Thr Leu Gln Arg Ser Arg Ala Lys Glu Asn Ala Glu Asp Ala Gly
350 355 360

Gly Lys Lys Leu Arg Leu Glu Met Glu Ala Glu Thr Pro Ser Thr
365 370 375

Glu Val Pro Pro Asp Pro Glu Pro Gly Val Pro Leu Thr Pro Pro
380 385 390

Ser Gln His Gln Glu Ala Gly Ala Gly Asp Leu Cys Ala Leu Cys
395 400 405

Gly Glu His Leu Tyr Val Leu Glu Arg Leu Cys Val Asn Gly His
410 415 420

Phe Phe His Arg Ser Cys Phe Arg Cys His Thr Cys Glu Ala Thr
425 430 435

Leu Trp Pro Gly Gly Tyr Glu Gln His Pro Gly Ser Arg Thr Ser
440 445 450

Gln Phe Phe Phe Ser Ala Leu Val Ala Met Glu Lys Glu Glu Lys
455 460 465

Glu Ser Pro Phe Ser Ser Glu Glu Glu Glu Glu Asp Val Pro Leu
470 475 480

Asp Ser Asp Val Glu Gln Ala Leu Gln Thr Phe Ala Lys Thr Ser
485 490 495

Gly Thr Met Asn Asn Tyr Pro Thr Trp Arg Arg Thr Leu Leu Arg
500 505 510

Arg Ala Lys Glu Glu Glu Met Lys Arg Phe Cys Lys Ala Gln Thr
515 520 525

Ile Gln Arg Arg Leu Asn Glu Ile Glu Ala Ala Leu Arg Glu Leu
530 535 540

Glu Ala Glu Gly Val Lys Leu Glu Leu Ala Leu Arg Arg Gln Ser
545 550 555

Ser


Ser Pro Glu Gln Gln Lys Lys Leu Trp Val Gly Gln Leu Leu
560 565 570

Gln Leu Val Asp Lys Lys Asn Ser Leu Val Ala Glu Glu Ala Glu
575 580 585

Leu Met Ile Thr Val Gln Glu Leu Asn Leu Glu Glu Lys Gln Trp
590 595 600

Gln Leu Asp Gln Glu Leu Arg Gly Tyr Met Asn Arg Glu Glu Asn
605 610 615

Leu Lys Thr Ala Ala Asp Arg Gln Ala Glu Asp Gln Val Leu Arg
620 625 630

Lys Leu Val Asp Leu Val Asn Gln Arg Asp Ala Leu Ile Arg Phe
635 640 645

Gln Glu Glu Arg Arg Leu Ser Glu Leu Ala Leu Gly Thr Gly Ala
650 655 660

Gln Gly





























(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP1NOT03 

(B) CLONE: 2470740 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:








Met Ala Ser Trp Pro Ala Ser Pro Leu Gln Trp Gly Pro Pro Leu
5 10 15

Ala Ser Cys Pro Ser Cys Cys Cys Cys Cys Phe His Cys Trp Gln
20 25 30

Pro Arg Val Gly Val Ala Cys Arg Gln Arg Cys Trp Pro Leu Arg
35 40 45

Trp Gly Trp Trp Val Trp Gly Pro Pro Thr Cys Ser Phe Val Gln
50 55 60

Pro Cys Thr Cys Pro Pro Val Phe Ser Tyr Ser Trp Pro Arg Val
65 70 75

Pro His Trp Gly Pro Ser Trp Xaa Met Ser Trp Arg Arg Arg Leu
80 85 90

Met Gly Val Pro Leu Gly Leu Trp Asn Cys Leu Val Leu Lys Leu
95 100 105

Xaa Gln Gly Leu Ala Pro Thr Ser Gly Gly
110 115













(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 
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(A) LIBRARY: SMCANOT01 

(B) CLONE: 2479092 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55 : 



Met Glu Ala Leu Arg Arg Ala His Glu Val Ala Leu Arg Leu Leu
5 10 15

Leu Cys Arg Pro Trp Ala Ser Arg Ala Ala Ala Arg Pro Lys Pro
20 25 30

Ser Ala Ser Glu Val Leu Thr Arg His Leu Leu Gln Arg Arg Leu
35 40 45

Pro His Trp Thr Ser Phe Cys Val Pro Tyr Ser Ala Val Arg Asn
50 55 60

Asp Gln Phe Gly Leu Ser His Phe Asn Trp Pro Val Gln Gly Ala
65 70 75

Asn Tyr His Val Leu Arg Thr Gly Cys Phe Pro Phe Ile Lys Tyr
80 85 90

His Cys Ser Lys Ala Pro Trp Gln Asp Leu Ala Arg Gln Asn Arg
95 100 105

Phe Phe Thr Ala Leu Lys Val Val Asn Leu Gly Ile Pro Thr Leu
110 115 120

Leu Tyr Gly Leu Gly Ser Trp Leu Phe Ala Arg Val Thr Glu Thr
125 130 135

Val His Thr Ser Tyr Gly Pro Ile Thr Val Tyr Phe Leu Asn Lys
140 145 150

Glu Asp Glu Gly Ala Met Tyr
155



















(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 197 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SMCANOT01 

(B) CLONE: 2480544 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 : 



Met Pro Pro Ala Gly Leu Arg Arg Ala Ala Pro Leu Thr Ala Ile
5 10 15

Ala Leu Leu Val Leu Gly Ala Pro Leu Val Leu Ala Gly Glu Asp
20 25 30

Cys Leu Trp Tyr Leu Asp Arg Asn Gly Ser Trp His Pro Gly Phe
35 40 45

Asn Cys Glu Phe Phe Thr Phe Cys Cys Gly Thr Cys Tyr His Arg
50 55 60

Tyr Cys Cys Arg Asp Leu Thr Leu Leu Ile Thr Glu Arg Gln Gln
65 70 75

Lys His Cys Leu Ala Phe Ser Pro Lys Thr Ile Ala Gly Ile Ala
80 85 90

Ser Ala Val Ile Leu Phe Val Ala Val Val Ala Thr Thr Ile Cys
95 100 105

Cys Phe Leu Cys Ser Cys Cys Tyr Leu Tyr Arg Arg Arg Gln Gln
110 115 120

Leu Gln Ser Pro Phe Glu Gly Gln Glu Ile Pro Met Thr Gly Ile
125 130 135

Pro Val Gln Pro Val Tyr Pro Tyr Pro Gln Asp Pro Lys Ala Gly
140 145 150

Pro Ala Pro Pro Gln Pro Gly Phe Met Tyr Pro Pro Ser Gly Pro
155 160 165

Ala Pro Gln Tyr Pro Leu Tyr Pro Ala Gly Pro Pro Val Tyr Asn
170 175 180

Pro Ala Ala Pro Pro Pro Tyr Met Pro Pro Gln Pro Ser Tyr Pro
185 190 195

Gly Ala



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 245 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAITUT21 

(B) CLONE: 2518547



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 : 



Met Gly Gly Ala Ser Arg Arg Val Glu Ser Gly Ala Trp Ala Tyr
5 10 15

Leu Ser Pro Leu Val Leu Arg Lys Glu Leu Glu Ser Leu Val Glu
20 25 30

Asn Glu Gly Ser Glu Val Leu Ala Leu Pro Glu Leu Pro Ser Ala
35 40 45

His Pro Ile Ile Phe Trp Asn Leu Leu Trp Tyr Phe Gln Arg Leu
50 55 60

Arg Leu Pro Ser Ile Leu Pro Gly Leu Val Leu Ala Ser Cys Asp
65 70 75

Gly Pro Ser His Ser Gln Ala Pro Ser Pro Trp Leu Thr Pro Asp
80 85 90

Pro Ala Ser Val Gln Val Arg Leu Leu Trp Asp Val Leu Thr Pro
95 100 105

Asp Pro Asn Ser Cys Pro Pro Leu Tyr Val Leu Trp Arg Val His
110 115 120

Ser Gln Ile Pro Gln Arg Val Val Trp Pro Gly Pro Val Pro Ala
125 130 135

Ser Leu Ser Leu Ala Leu Leu Glu Ser Val Leu Arg His Val Gly
140 145 150

Leu Asn Glu Val His Lys Ala Val Gly Leu Leu Leu Glu Thr Leu
155 160 165

Gly Pro Pro Pro Thr Gly Leu His Leu Gln Arg Gly Ile Tyr Arg
170 175 180

Glu Ile Leu Phe Leu Thr Met Ala Ala Leu Gly Lys Asp His Val
185 190 195

Asp Ile Val Ala Phe Asp Lys Lys Tyr Lys Ser Ala Phe Asn Lys
200 205 210

Leu Ala Ser Ser Met Gly Lys Glu Glu Leu Arg His Arg Arg Ala
215 220 225

Gln Met Pro Thr Pro Lys Ala Ile Asp Cys Arg Lys Cys Phe Gly
230 235 240

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Ala Pro Pro Glu Cys 
245 



(2) INFORMATION FOR SEQ ID NO: 58:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 310 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GBLANOT02 

(B) CLONE: 2530650 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 : 



Met Leu Leu Pro Gln Leu Cys Trp Leu Pro Leu Leu Ala Gly Leu
5 10 15

Leu Pro Pro Val Pro Ala Gln Lys Phe Ser Ala Leu Thr Phe Leu
20 25 30

Arg Val Asp Gln Asp Lys Asp Lys Asp Cys Ser Leu Asp Cys Ala
35 40 45

Gly Ser Pro Gln Lys Pro Leu Cys Ala Ser Asp Gly Arg Thr Phe
50 55 60

Leu Ser Arg Cys Glu Phe Gln Arg Ala Lys Cys Lys Asp Pro Gln
65 70 75

Leu Glu Ile Ala Tyr Arg Gly Asn Cys Lys Asp Val Ser Arg Cys
80 85 90

Val Ala Glu Arg Lys Tyr Thr Gln Glu Gln Ala Arg Lys Glu Phe
95 100 105

Gln Gln Val Phe Ile Pro Glu Cys Asn Asp Asp Gly Thr Tyr Ser
110 115 120

Gln Val Gln Cys His Ser Tyr Thr Gly Tyr Cys Trp Cys Val Thr
125 130 135

Pro Asn Gly Arg Pro Ile Ser Gly Thr Ala Val Ala His Lys Thr
140 145 150

Pro Arg Cys Pro Gly Ser Val Asn Glu Lys Leu Pro Gln Arg Glu
155 160 165

Gly Thr Gly Lys Thr Asp Asp Ala Ala Ala Pro Ala Leu Glu Thr
170 175 180

Gln Pro Gln Gly Asp Glu Glu Asp Ile Ala Ser Arg Tyr Pro Thr
185 190 195

Leu Trp Thr Glu Gln Val Lys Ser Arg Gln Asn Lys Thr Asn Lys
200 205 210

Asn Ser Val Ser Ser Cys Asp Gln Glu His Gln Ser Ala Leu Glu
215 220 225

Glu Ala Lys Gln Pro Lys Asn Asp Asn Val Val Ile Pro Glu Cys
230 235 240

Ala His Gly Gly Leu Tyr Lys Pro Val Gln Cys His Pro Ser Thr
245 250 255

Gly Tyr Cys Trp Cys Val Leu Val Asp Thr Gly Arg Pro Ile Pro
260 265 270

Gly Thr Ser Thr Arg Tyr Glu Gln Pro Lys Cys Asp Asn Thr Gly
275 280 285

Gln Gly Pro Pro Ser Gln Ser Pro Gly Pro Val Gln Gly Pro Pro
290 295 300

Ala Thr Arg Leu Ser Gly Cys Gln Lys Ala
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305 



310 



(2) INFORMATION FOR SEQ ID NO: 59:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 256 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THYMNOT04 

(B) CLONE: 2652271 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 : 

Met Arg Pro Ala Ala Leu Arg Gly Ala Leu Leu Gly Cys Leu Cys
5 10 15

Leu Ala Leu Leu Cys Leu Gly Gly Ala Asp Lys Arg Leu Arg Asp
20 25 30

Asn His Glu Trp Lys Lys Leu Ile Met Val Gln His Trp Pro Glu
35 40 45

Thr Val Cys Glu Lys Ile Gln Asn Asp Cys Arg Asp Pro Pro Asp
50 55 60

Tyr Trp Thr Ile His Gly Leu Trp Pro Asp Lys Ser Glu Gly Cys
65 70 75

Asn Arg Ser Trp Pro Phe Asn Leu Glu Glu Ile Lys Asp Leu Leu
80 85 90

Pro Glu Met Arg Ala Tyr Trp Pro Asp Val Ile His Ser Phe Pro
95 100 105

Asn Arg Ser Arg Phe Trp Lys His Glu Trp Glu Lys His Gly Thr
110 115 120

Cys Ala Ala Gln Val Asp Ala Leu Asn Ser Gln Lys Lys Tyr Phe
125 130 135

Gly Arg Ser Leu Glu Leu Tyr Arg Glu Leu Asp Leu Asn Ser Val
140 145 150

Leu Leu Lys Leu Gly Ile Lys Pro Ser Ile Asn Tyr Tyr Gln Val
155 160 165

Ala Asp Phe Lys Asp Ala Leu Ala Arg Val Tyr Gly Val Ile Pro
170 175 180

Lys Ile Gln Cys Leu Pro Pro Ser Gln Asp Glu Glu Val Gln Thr
185 190 195

Ile Gly Gln Ile Glu Leu Cys Leu Thr Lys Gln Asp Gln Gln Leu
200 205 210

Gln Asn Cys Thr Glu Pro Gly Glu Gln Pro Ser Pro Lys Gln Glu
215 220 225

Val Trp Leu Ala Asn Gly Ala Ala Glu Ser Arg Gly Leu Arg Val
230 235 240

Cys Glu Asp Gly Pro Val Phe Tyr Pro Pro Pro Lys Lys Thr Lys
245 250 255

His



(2) INFORMATION FOR SEQ ID NO: 60: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 160 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT11

(B) CLONE: 2746976 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 : 



Met Gln Phe Met Leu Leu Phe Ser Arg Gln Gly Lys Leu Arg Leu
5 10 15

Gln Lys Trp Tyr Val Pro Leu Ser Asp Lys Glu Lys Arg Lys Ile
20 25 30

Thr Arg Glu Leu Val Gln Thr Val Leu Ala Arg Lys Pro Lys Met
35 40 45

Cys Ser Phe Leu Glu Trp Arg Asp Leu Lys Ile Val Tyr Lys Arg
50 55 60

Tyr Ala Ser Leu Tyr Phe Cys Cys Ala Ile Glu Asp Gln Asp Asn
65 70 75

Glu Leu Ile Thr Leu Glu Ile Ile His Arg Tyr Val Glu Leu Leu
80 85 90

Asp Lys Tyr Phe Gly Ser Val Cys Glu Leu Asp Ile Ile Phe Asn
95 100 105

Phe Glu Lys Ala Tyr Phe Ile Leu Asp Glu Phe Leu Leu Gly Gly
110 115 120

Glu Val Gln Glu Thr Ser Lys Lys Asn Val Leu Lys Ala Ile Glu
125 130 135

Gln Ala Asp Leu Leu Gln Glu Asp Ala Lys Glu Ala Glu Thr Pro
140 145 150

Arg Ser Val Leu Glu Glu Ile Gly Leu Thr
155 160














(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 341 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP1AZS08 

(B) CLONE: 2753496 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 : 

Met Lys Arg Ala Leu Gly Arg Arg Lys Gly Val Trp Leu Arg Leu
5 10 15

Arg Lys Ile Leu Phe Cys Val Leu Gly Leu Tyr Ile Ala Ile Pro
20 25 30

Phe Leu Ile Lys Leu Cys Pro Gly Ile Gln Ala Lys Leu Ile Phe
35 40 45

Leu Asn Phe Val Arg Val Pro Tyr Phe Ile Asp Leu Lys Lys Pro
50 55 60

Gln Asp Gln Gly Leu Asn His Thr Cys Asn Tyr Tyr Leu Gln Pro
65 70 75

Glu Glu Asp Val Thr Ile Gly Val Trp His Thr Val Pro Ala Val
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80 






Trp 


Trp 


Lys 


Asn 


Ala 
95 


Gin 


Gly 


Ala 


Leu 


Ala 


Ser 


Ser 
110 


His 


Pro 


Ala 


Gly 


Thr 


Arg 


Gly 
125 


Gly 


Asp 


Leu 


Ser 


Ser 


Leu 


Gly 
140 


Tyr 


His 


Trp 


Gly 


Asp 


Ser 


Val 
155 


Gly 


Thr 


Asp 


Ala 


Leu 


His 


Val 
170 


Phe 


Asp 


Asn 


Pro 


Val 


Tyr 


He 
185 


Trp 


Gly 


Thr 


Asn 


Leu 


Val 


Arg 
200 


Arg 


Leu 


Ala 


Leu 


He 


Leu 


Glu 
215 


Ser 


Pro 


Lys 


Ser 


His 


Pro 


Phe 


Ser 


Val 








230 






Asp 


Trp 


Phe 


Phe 


Leu 

245 


Asp 


Pro 


Ala 


Asn 


Asp 


Glu 


Asn 
260 


Val 


Lys 


Leu 


His 


Ala 


Glu 


Asp 
275 


Asp 


Pro 


Lys 


Leu 


Tyr 


Ser 


He 
290 


Ala 


Ala 


Lys 


Val 


Gin 


Phe 


Val 


Pro 


Phe 








305 






Lys 


Tyr 


He 


Tyr 


Lys 
320 


Ser 


Pro 


Phe 


Leu 


Gly 


Lys 


Ser 
335 


Glu 


Pro 








85 










90 


Lys 


Asp 


Gin 
100 


Met 


Trp 


Tyr 


Glu 


Asp 
105 


He 


He 


Leu 
115 


Tyr 


Leu 


His 


Gly 


Asn 
120 


His 


Arg 


Val 

130 


Glu 


Leu 


Tyr 


Lys 


Val 
135 


Val 


Val 


Thr 
145 


Phe 


Asp 


Tyr 


Arg 


Gly 
150 


Pro 


Ser 


Glu 
160 


Arg 


Gly 


Met 


Thr 


Tyr 

165 


Trp 


He 


Lys 
175 


Ala 


Arg 


Ser 


Gly 


Asp 
180 


His 


Ser 


Leu 


Gly 


Thr 


Gly Val 


Ala 






190 










195 


Cys 


Glu 


Arg 


Glu 


Thr 


Pro 


Pro 


Asp 




205 










210 


Phe 


Thr 


Asn 

220 


He 


Arg 


Glu 


Glu 


Ala 
225 


He 


Tyr 


Arg 
235 


Tyr 


Phe 


Pro 


Gly 


Phe 
240 


He 


Thr 


Ser 
250 


Ser 


Gly 


He 


Lys 


Phe 
255 


His 


He 


Ser 
265 


Cys 


Pro 


Leu 


Leu 


He 
270 


Val 


Val 


Pro 

280 


Phe 


Gin 


Leu 


Gly 


Arg 
285 


Pro 


Ala 


Arg 

295 


Ser 


Phe 


Arg 


Asp 


Phe 
300 


His 


Ser 


Asp 
310 


Leu 


Gly 


Tyr 


Arg 


His 
315 


Glu 


Leu 


Pro 
325 


Arg 


He 


Leu 


Arg 


Glu 
330 


Glu 


His 


Gin 
340 


His 











(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 430 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: OVARTUT03 

(B) CLONE: 2781553 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62 : 



Met Ala Glu Gly Glu Asp Val Gly Trp Trp Arg Ser Trp Leu Gln
5 10 15

Gln Ser Tyr Gln Ala Val Lys Glu Lys Ser Ser Glu Ala Leu Glu
20 25 30

Phe Met Lys Arg Asp Leu Thr Glu Phe Thr Gln Val Val Gln His
35 40 45

Asp Thr Ala Cys Thr Ile Ala Ala Thr Ala Ser Val Val Lys Glu
50 55 60

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Lys 


Leu Ala Thr Glu Gly Ser Ser Gly Ala Thr Glu Lys Met Lys
65 70 75

Lys Gly Leu Ser Asp Phe Leu Gly Val Ile Ser Asp Thr Phe Ala
80 85 90

Pro Ser Pro Asp Lys Thr Ile Asp Cys Asp Val Ile Thr Leu Met
95 100 105

Gly Thr Pro Ser Gly Thr Ala Glu Pro Tyr Asp Gly Thr Lys Ala
110 115 120

Arg Leu Tyr Ser Leu Gln Ser Asp Pro Ala Thr Tyr Cys Asn Glu
125 130 135

Pro Asp Gly Pro Pro Glu Leu Phe Asp Ala Trp Leu Ser Gln Phe
140 145 150

Cys Leu Glu Glu Lys Lys Gly Glu Ile Ser Glu Leu Leu Val Gly
155 160 165

Ser Pro Ser Ile Arg Ala Leu Tyr Thr Lys Met Val Pro Ala Ala
170 175 180

Val Ser His Ser Glu Phe Trp His Arg Tyr Phe Tyr Lys Val His
185 190 195

Gln Leu Glu Gln Glu Gln Ala Arg Arg Asp Ala Leu Lys Gln Arg
200 205 210

Ala Glu Gln Ser Ile Ser Glu Glu Pro Gly Trp Glu Glu Glu Glu
215 220 225

Glu Glu Leu Met Gly Ile Ser Pro Ile Ser Pro Lys Glu Ala Lys
230 235 240

Val Pro Val Ala Lys Ile Ser Thr Phe Pro Glu Gly Glu Pro Gly
245 250 255

Pro Gln Ser Pro Cys Glu Glu Asn Leu Val Thr Ser Val Glu Pro
260 265 270

Pro Ala Glu Val Thr Pro Ser Glu Ser Ser Glu Ser Ile Ser Leu
275 280 285

Val Thr Gln Ile Ala Asn Pro Ala Thr Ala Pro Glu Ala Arg Val
290 295 300

Leu Pro Lys Asp Leu Ser Gln Lys Leu Leu Glu Ala Ser Leu Glu
305 310 315

Glu Gln Gly Leu Ala Val Asp Val Gly Glu Thr Gly Pro Ser Pro
320 325 330

Pro Ile His Ser Lys Pro Leu Thr Pro Ala Gly His Thr Gly Gly
335 340 345

Pro Glu Pro Arg Pro Pro Ala Arg Val Glu Thr Leu Arg Glu Glu
350 355 360

Ala Pro Thr Asp Leu Arg Val Phe Glu Leu Asn Ser Asp Ser Gly
365 370 375

Lys Ser Thr Pro Ser Asn Asn Gly Lys Lys Gly Ser Ser Thr Asp
380 385 390

Ile Ser Glu Asp Trp Glu Lys Asp Phe Asp Leu Asp Met Thr Glu
395 400 405

Glu Glu Val Gln Met Ala Leu Ser Lys Val Asp Ala Ser Gly Glu
410 415 420

Leu Glu Asp Val Glu Trp Glu Asp Trp Glu
425 430













(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 
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(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ADRETUT06

(B) CLONE: 2821925 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63 : 



Met Gly Pro Val Arg Leu Gly Ile Leu Leu Phe Leu Phe Leu Ala
5 10 15

Val His Glu Ala Trp Ala Gly Met Leu Lys Glu Glu Asp Asp Asp
20 25 30

Thr Glu Arg Leu Pro Ser Lys Cys Glu Val Cys Lys Leu Leu Ser
35 40 45

Thr Glu Leu Gln Ala Glu Leu Ser Arg Thr Gly Arg Ser Arg Glu
50 55 60

Val Leu Glu Leu Gly Gln Val Leu Asp Thr Gly Lys Arg Lys Arg
65 70 75

His Val Pro Tyr Ser Val Ser Glu Thr Arg Leu Glu Glu Ala Leu
80 85 90

Glu Asn Leu Cys Glu Arg Ile Leu Asp Tyr Ser Val His Ala Glu
95 100 105

Arg Lys Gly Ser Leu Arg Tyr Ala Lys Gly Gln Ser Gln Thr Met
110 115 120

Ala Thr Leu Lys Gly Leu Val Gln Lys Gly Val Lys Val Asp Leu
125 130 135

Gly Ile Pro Leu Glu Leu Leu Gly

















(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: UTRSTUT05 

(B) CLONE: 2879068 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64 : 



Met 


Glu 


Asp 


Met 


Asn 
5 


Glu 


Tyr 


Ser 


Asn 


He 
10 


Glu 


Glu 


Phe 


Ala 


Glu 
15 


Gly 


Ser 


Lys 


He 


Asn 
20 


Ala 


Ser 


Lys 


Asn 


Gin 
25 


Gin 


Asp 


Asp 


Gly 


Lys 
30 


Met 


Phe 


He 


Gly 


Gly 
35 


Leu 


Ser 


Trp 


Asp 


Thr 
40 


Ser 


Lys 


Lys 


Asp 


Leu 
45 


Thr 


Glu 


Tyr 


Leu 


Ser 
50 


Arg 


Phe 


Gly 


Glu 


Val 
55 


Val 


Asp 


Cys 


Thr 


He 
60 


Lys 


Thr 


Asp 


Pro 


Val 
65 


Thr 


Gly 


Arg 


Ser 


Arg 
70 


Gly 


Phe 


Gly 


Phe 


Val 
75 


Leu 


Phe 


Lys 


Asp 


Ala 
80 


Ala 


Ser 


Val 


Asp 


Lys 
85 


Val 


Leu 


Glu 


Leu 


Lys 
90 


Glu 


His 


Lys 


Leu 


Asp 
95 


Gly 


Lys 


Leu 


He 


Asp 
100 


Pro 


Lys 


Arg 


Ala 


Lys 
105 


Ala 


Leu 


Lys 


Gly 


Lys 
110 


Glu 


Pro 


Pro 


Lys 


Lys 
115 


Val 


Phe 


Val 


Gly 


Gly 
120 


Leu 


Ser 


Pro 


Asp 


Thr 


Ser 


Glu 


Glu 


Gin 


He 


Lys 


Glu 


Tyr 


Phe 


Gly 



140 



(2) INFORMATION FOR SEQ ID NO: 64:
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125 



Ala 


Phe 


Gly 


Glu 


He 
140 


Glu 


Asn 


Thr 


Asn 


Glu 


Arg 


Arg 
155 


Gly 


Phe 


Glu 


Pro 


Val 


Lys 


Lys 
170 


Leu 


Leu 


Ser 


Gly 


Lys 


Cys 


Glu 
185 


He 


Lys 


Arg 


Gin 


Gin 


Gin 


Gin 
200 


Gin 


Gin 


Gly 


Gly 


Arg 


Gly 


Gly 
215 


Thr 


Arg 


Asn 


Trp 


Asn 


Gin 


Gly 
230 


Phe 


Asn 


Asn 


Tyr 


Asn 


Ser 


Ala 
245 


Tyr 


Gly 


Gly 


Gly 


Tyr 


Asp 


Tyr 
260 


Thr 


Gly 


Gly 


Gin 


Gly 


Tyr 


Ala 
275 


Asp 


Tyr 


Lys 


Ala 


Ser 


Arg 


Gly 


Gly 


Gly 



290 

Tyr 







130 










135 


He 


Glu 


Leu 
145 


Pro 


Met 


Asp 


Thr 


Lys 
150 


Cys 


Phe 


He 
160 


Thr 


Tyr 


Thr 


Asp 


Glu 
165 


Glu 


Ser 


Arg 
175 


Tyr 


His 


Gin 


He 


Gly 
180 


Val 


Ala 


Gin 
190 


Pro 


Lys 


Glu 


Val 


Tyr 
195 


Lys 


Gly 


Gly 
205 


Arg 


Gly 


Ala 


Ala 


Ala 
210 


Gly 


Arg 


Gly 
220 


Arg 


Gly 


Gin 


Gly 


Gin 
225 


Asn 


Tyr 


Tyr 

235 


Asp 


Gin 


Gly 


Tyr 


Gly 
240 


Gly 


Asp 


Gin 
250 


Asn 


Tyr 


Ser 


Gly 


Tyr 
255 


Tyr 


Asn 


Tyr 
265 


Gly 


Asn 


Tyr 


Gly 


Tyr 
270 


Ser 


Gly 


Gin 
280 


Gin 


Ser 


Thr 


Tyr 


Gly 
285 


Asn 


His 


Gin 
295 


Asn 


Asn 


Tyr 


Gin 


Pro 
300 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SINJNOT02 

(B) CLONE: 2886757 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65 : 



Met Gly Glu Pro Gln Gln Val Ser Ala Leu Pro Pro Pro Pro Met
5 10 15

Gln Tyr Ile Lys Glu Tyr Thr Asp Glu Asn Ile Gln Glu Gly Leu
20 25 30

Ala Pro Lys Pro Pro Pro Pro Ile Lys Asp Ser Tyr Met Met Phe
35 40 45

Gly Asn Gln Phe Gln Cys Asp Asp Leu Ile Ile Arg Pro Leu Glu
50 55 60

Ser Gln Gly Ile Glu Arg Leu His Pro Met Gln Phe Asp His Lys
65 70 75

Lys Glu Leu Arg Lys Leu Asn Met Ser Ile Leu Ile Asn Phe Leu
80 85 90

Asp Leu Leu Asp Ile Leu Ile Arg Ser Pro Gly Ser Ile Lys Arg
95 100 105

Glu Glu Lys Leu Glu Asp Leu Lys Leu Leu Phe Val His Val His
110 115 120

His Leu Ile Asn Glu Tyr Arg Pro His Gln Ala Arg Glu Thr Leu
125 130 135

Arg Val Met Met Glu Val Gln Lys Arg Gln Arg Leu Glu Thr Ala
140 145 150

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Glu 


Arg Phe Gln Lys His Leu Glu Arg Val Ile Glu Met Ile Gln
155 160 165

Asn Cys Leu Ala Ser Leu Pro Asp Asp Leu Pro His Ser Glu Ala
170 175 180

Gly Met Arg Val Lys Thr Glu Pro Met Asp Ala Asp Asp Ser Asn
185 190 195

Asn Cys Thr Gly Gln Asn Glu His Gln Arg Glu Asn Ser Gly His
200 205 210

Arg Arg Asp Gln Ile Ile Glu Lys Asp Ala Ala Leu Cys Val Leu
215 220 225

Ile Asp Glu Met Asn Glu Arg Pro

















(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 354 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SCORNOT04 

(B) CLONE: 2964329 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66 : 



Met Ala Gly Ala Gly Ala Gly Ala Gly Ala Arg Gly Gly Ala Ala
5 10 15

Ala Gly Val Glu Ala Arg Ala Arg Asp Pro Pro Pro Ala His Arg
20 25 30

Ala His Pro Arg His Pro Arg Pro Ala Ala Gln Pro Ser Ala Arg
35 40 45

Arg Met Asp Gly Gly Ser Gly Gly Leu Gly Ser Gly Asp Asn Ala
50 55 60

Pro Thr Thr Glu Ala Leu Phe Val Ala Leu Gly Ala Gly Val Thr
65 70 75

Ala Leu Ser His Pro Leu Leu Tyr Val Lys Leu Leu Ile Gln Val
80 85 90

Gly His Glu Pro Met Pro Pro Thr Leu Gly Thr Asn Val Leu Gly
95 100 105

Arg Lys Val Leu Tyr Leu Pro Ser Phe Phe Thr Tyr Ala Lys Tyr
110 115 120

Ile Val Gln Val Asp Gly Lys Ile Gly Leu Phe Arg Gly Leu Ser
125 130 135

Pro Arg Leu Met Ser Asn Ala Leu Ser Thr Val Thr Arg Gly Ser
140 145 150

Met Lys Lys Val Phe Pro Pro Asp Glu Ile Glu Gln Val Ser Asn
155 160 165

Lys Asp Asp Met Lys Thr Ser Leu Lys Lys Val Val Lys Glu Thr
170 175 180

Ser Tyr Glu Met Met Met Gln Cys Val Ser Arg Met Leu Ala His
185 190 195

Pro Leu His Val Ile Ser Met Arg Cys Met Val Gln Phe Val Gly
200 205 210

Arg Glu Ala Lys Tyr Ser Gly Val Leu Ser Ser Ile Gly Lys Ile
215 220 225

Phe Lys Glu Glu Gly Leu Leu Gly Phe Phe Val Gly Leu Ile Pro
230 235 240

His Leu Leu Gly Asp Val Val Phe Leu Trp Gly Cys Asn Leu Leu
245 250 255

Ala His Phe Ile Asn Ala Tyr Leu Val Asp Asp Ser Phe Ser Gln
260 265 270

Ala Leu Ala Ile Arg Ser Tyr Thr Lys Phe Val Met Gly Ile Ala
275 280 285

Val Ser Met Leu Thr Tyr Pro Phe Leu Leu Val Gly Asp Leu Met
290 295 300

Ala Val Asn Asn Cys Gly Leu Gln Ala Gly Leu Pro Pro Tyr Ser
305 310 315

Pro Val Phe Lys Ser Trp Ile His Cys Trp Lys Tyr Leu Ser Val
320 325 330

Gln Gly Gln Leu Phe Arg Gly Ser Ser Leu Leu Phe Arg Arg Val
335 340 345

Ser Ser Gly Ser Cys Phe Ala Leu Glu
350

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 235 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SCORNOT04 

(B) CLONE: 2965248 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 : 



Met Ala Ser Thr Ile Ser Ala Tyr Lys Glu Lys Met Lys Glu Leu
5 10 15

Ser Val Leu Ser Leu Ile Cys Ser Cys Phe Tyr Thr Gln Pro His
20 25 30

Pro Asn Thr Val Tyr Gln Tyr Gly Asp Met Glu Val Lys Gln Leu
35 40 45

Asp Lys Arg Ala Ser Gly Gln Ser Phe Glu Val Ile Leu Lys Ser
50 55 60

Pro Ser Asp Leu Ser Pro Glu Ser Pro Met Leu Ser Ser Pro Pro
65 70 75

Lys Lys Lys Asp Thr Ser Leu Glu Glu Leu Gln Lys Arg Leu Glu
80 85 90

Ala Ala Glu Glu Arg Arg Lys Thr Gln Glu Ala Gln Val Leu Lys
95 100 105

Gln Leu Ala Asp Gly Ala Ser Thr Ser Ala Arg Cys Cys Thr Arg
110 115 120

Arg Trp Arg Arg Ile Thr Thr Ser Ala Ala Arg Arg Arg Arg Ser
125 130 135

Ser Thr Thr Arg Trp Ser Ser Ala Arg Arg Ser Ala Arg His Thr
140 145 150

Trp Pro His Cys Ala Ser Gly Cys Ala Arg Arg Ser Cys Thr Arg
155 160 165

Pro Arg Cys Ala Gly Thr Arg Ser Ser Glu Lys Arg Cys Arg Ala
170 175 180

Lys Gly Pro Gly Arg Ala Ala Pro Ile Leu Arg Arg Asn Thr Phe
185 190 195

Gly Phe Trp Phe Cys Phe Val His Leu Cys Leu Asp Ala Thr Phe
200 205 210

Val Pro Pro Pro Pro Pro Gln Pro Pro Ala Ser Cys Phe Ser Ser
215 220 225

Ala Leu Ser Arg Pro Ala Leu Ser Ser Trp
230 235

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 221 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: TLYMNOT06

(B) CLONE: 3000534

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Met Trp Ser Ala Gly Arg Gly Gly Ala Ala Trp Pro Val Leu Leu
5 10 15

Gly Leu Leu Leu Ala Leu Leu Val Pro Gly Gly Gly Ala Ala Lys
20 25 30

Thr Gly Ala Glu Leu Val Thr Cys Gly Ser Val Leu Lys Leu Leu
35 40 45

Asn Thr His His Arg Val Arg Leu His Ser His Asp Ile Lys Tyr
50 55 60

Gly Ser Gly Ser Gly Gln Gln Ser Val Thr Gly Val Glu Ala Ser
65 70 75

Asp Asp Ala Asn Ser Tyr Trp Arg Ile Arg Gly Gly Ser Glu Gly
80 85 90

Gly Cys Pro Arg Gly Ser Pro Val Arg Cys Gly Gln Ala Val Arg
95 100 105

Leu Thr His Val Leu Thr Gly Lys Asn Leu His Thr His His Phe
110 115 120

Pro Ser Pro Leu Ser Asn Asn Gln Glu Val Ser Ala Phe Gly Glu
125 130 135

Asp Gly Glu Gly Asp Asp Leu Asp Leu Trp Thr Val Arg Cys Ser
140 145 150

Gly Gln His Trp Glu Arg Glu Ala Ala Val Arg Phe Gln His Val
155 160 165

Gly Thr Ser Val Phe Leu Ser Val Thr Gly Glu Gln Tyr Gly Ser
170 175 180

Pro Ile Arg Gly Gln His Glu Val His Gly Met Pro Ser Ala Asn
185 190 195

Thr His Asn Thr Trp Lys Ala Met Glu Gly Ile Phe Ile Lys Pro
200 205 210

Ser Val Glu Pro Ser Ala Gly His Asp Glu Leu
215 220

(2) INFORMATION FOR SEQ ID NO: 69: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 483 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: HEAANOT01

(B) CLONE: 3046870

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Met Lys Ala Phe His Thr Phe Cys Val Val Leu Leu Val Phe Gly
5 10 15

Ser Val Ser Glu Ala Lys Phe Asp Asp Phe Glu Asp Glu Glu Asp
20 25 30

Ile Val Glu Tyr Asp Asp Asn Asp Phe Ala Glu Phe Glu Asp Val
35 40 45

Met Glu Asp Ser Val Thr Glu Ser Pro Gln Arg Val Ile Ile Thr
50 55 60

Glu Asp Asp Glu Asp Glu Thr Thr Val Glu Leu Glu Gly Gln Asp
65 70 75

Glu Asn Gln Glu Gly Asp Phe Glu Asp Ala Asp Thr Gln Glu Gly
80 85 90

Asp Thr Glu Ser Glu Pro Tyr Asp Asp Glu Glu Phe Glu Gly Tyr
95 100 105

Glu Asp Lys Pro Asp Thr Ser Ser Ser Lys Asn Lys Asp Pro Ile
110 115 120

Thr Ile Val Asp Val Pro Ala His Leu Gln Asn Ser Trp Glu Ser
125 130 135

Tyr Tyr Leu Glu Ile Leu Met Val Thr Gly Leu Leu Ala Tyr Ile
140 145 150

Met Asn Tyr Ile Ile Gly Lys Asn Lys Asn Ser Arg Leu Ala Gln
155 160 165

Ala Trp Phe Asn Thr His Arg Glu Leu Leu Glu Ser Asn Phe Thr
170 175 180

Leu Val Gly Asp Asp Gly Thr Asn Lys Glu Ala Thr Ser Thr Gly
185 190 195

Lys Leu Asn Gln Glu Asn Glu His Ile Tyr Asn Leu Trp Cys Ser
200 205 210

Gly Arg Val Cys Cys Glu Gly Met Leu Ile Gln Leu Arg Phe Leu
215 220 225

Lys Arg Gln Asp Leu Leu Asn Val Leu Ala Arg Met Met Arg Pro
230 235 240

Val Ser Asp Gln Val Gln Ile Lys Val Thr Met Asn Asp Glu Asp
245 250 255

Met Asp Thr Tyr Val Phe Ala Val Gly Thr Arg Lys Ala Leu Val
260 265 270

Arg Leu Gln Lys Glu Met Gln Asp Leu Ser Glu Phe Cys Ser Asp
275 280 285

Lys Pro Lys Ser Gly Ala Lys Tyr Gly Leu Pro Asp Ser Leu Ala
290 295 300

Ile Leu Ser Glu Met Gly Glu Val Thr Asp Gly Met Met Asp Thr
305 310 315

Lys Met Val His Phe Leu Thr His Tyr Ala Asp Lys Ile Glu Ser
320 325 330

Val His Phe Ser Asp Gln Phe Ser Gly Pro Lys Ile Met Gln Glu
335 340 345

Glu Gly Gln Pro Leu Lys Leu Pro Asp Thr Lys Arg Thr Leu Leu
350 355 360

Phe Thr Phe Asn Val Pro Gly Ser Gly Asn Thr Tyr Pro Lys Asp
365 370 375

[illegible] [illegible] Ala Leu Leu Pro Leu Met Asn Met Val Ile Tyr Ser Ile
380 385 390

Asp Lys Ala Lys Lys Phe Arg Leu Asn Arg Glu Gly Lys Gln Lys
395 400 405

Ala Asp Lys Asn Arg Ala Arg Val Glu Glu Asn Phe Leu Lys Leu
410 415 420

[illegible] His Val Gln Arg Gln Glu Ala Ala Gln Ser Arg Arg Glu Glu
425 430 435

Lys Lys Arg Ala Glu Lys Glu Arg Ile Met Asn Glu Glu Asp Pro
440 445 450

Glu Lys Gln Arg Arg Leu Glu Glu Ala Ala Leu Arg Arg Glu Gln
455 460 465

Lys Lys Leu Glu Lys Lys Gln Met Lys Met Lys Gln Ile Lys Val
470 475 480

Lys Ala Met


(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PONSAZT01 

(B) CLONE: 3057669 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70 : 



Met Asp His Glu Asp Ile Ser Glu Ser Val Asp Ala Ala Tyr Asn
5 10 15

Leu Gln Asp Ser Cys Leu Thr Asp Cys Asp Val Glu Asp Gly Thr
20 25 30

Met Asp Gly Asn Asp Glu Gly His Ser Phe Glu Leu Cys Pro Ser
35 40 45

Glu Ala Ser Pro Tyr Val Arg Ser Arg Glu Arg Thr Ser Ser Ser
50 55 60

Ile Val Phe Glu Asp Ser Gly Cys Asp Asn Ala Ser Ser Lys Glu
65 70 75

Glu Pro Lys Thr Asn Arg Leu His Ile Gly Asn His Cys Ala Asn
80 85 90

Lys Leu Thr Ala Phe Lys Pro Thr Ser Ser Lys Ser Ser Ser Glu
95 100 105

Ala Thr Leu Ser Ile Ser Pro Pro Arg Pro Thr Thr Leu Ser Leu
110 115 120

Asp Leu Thr Lys Asn Thr Thr Glu Lys Leu Gln Pro Ser Ser Pro
125 130 135

Lys Val Tyr Leu Tyr Ile Gln Met Gln Leu Cys Arg Lys Glu Asn
140 145 150

Leu Lys Asp Trp Met Asn Gly Arg Cys Thr Ile Glu Glu Arg Glu
155 160 165

Arg Ser Val Cys Leu His Ile Phe Leu Gln Ile Ala Glu Ala Val
170 175 180

Glu Phe Leu His Ser Lys Gly Leu Met His Arg Asp Leu Lys Pro
185 190 195

Ser Asn Ile Phe Phe Thr Met Asp Asp Val Val Lys Val Gly Asp
200 205 210

Phe Gly Leu Val Thr Ala Met Asp Gln Asp Glu Glu Glu Gln Thr
215 220 225

Val Leu Thr Pro Met Pro Ala Tyr Ala Arg His Thr Gly Gln Val
230 235 240

Gly Thr Lys Leu Tyr Met Ser Pro Glu Gln Ile His Gly Asn Ser
245 250 255

Tyr Ser His Lys Val Asp Ile Phe Ser Leu Gly Leu Ile Leu Phe
260 265 270

Glu Leu Leu Tyr Pro Phe Ser Thr Gln Met Glu Arg Val Arg Thr
275 280 285

Leu Thr Asp Val Arg Asn Leu Lys Phe Pro Pro Leu Phe Thr Gln
290 295 300

Lys Tyr Pro Cys Glu Tyr Val Met Val Gln Asp Met Leu Ser Pro
305 310 315

Ser Pro Met Glu Arg Pro Glu Ala Ile Asn Ile Ile Glu Asn Ala
320 325 330

Val Phe Glu Asp Leu Asp Phe Pro Gly Lys Thr Val Leu Arg Gln
335 340 345

Arg Ser Arg Ser Leu Ser Ser Ser Gly Thr Lys His Ser Arg Gln
350 355 360

Ser Asn Asn Ser His Ser Pro Leu Pro Ser Asn
365 370


(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 402 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: HEAONOT03

(B) CLONE: 3088178

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Met Met Asn Asn Arg Phe Arg Lys Asp Met Met Lys Asn Ala Ser
5 10 15

Glu Ser Lys Leu Ser Lys Asp Asn Leu Lys Lys Arg Leu Lys Glu
20 25 30

Glu Phe Gln His Ala Met Gly Gly Val Pro Ala Trp Ala Glu Thr
35 40 45

Thr Lys Arg Lys Thr Ser Ser Asp Asp Glu Ser Glu Glu Asp Glu
50 55 60

Asp Asp Leu Leu Gln Arg Thr Gly Asn Phe Ile Ser Thr Ser Thr
65 70 75

Ser Leu Pro Arg Gly Ile Leu Lys Met Lys Asn Cys Gln His Ala
80 85 90

Asn Ala Glu Arg Pro Thr Val Ala Arg Ile Ser Ser Val Gln Phe
95 100 105

His Pro Gly Ala Gln Ile Val Met Val Ala Gly Leu Asp Asn Ala
110 115 120

Val Ser Leu Phe Gln Val Asp Gly Lys Thr Asn Pro Lys Ile Gln
125 130 135

Ser Ile Tyr Leu Glu Arg Phe Pro Ile Phe Lys Ala Cys Phe Ser
140 145 150

Ala Asn Gly Glu Glu Val Leu Ala Thr Ser Thr His Ser Lys Val
155 160 165

Leu Tyr Val Tyr Asp Met Leu Ala Gly Lys Leu Ile Pro Val His
170 175 180

Gln Val Arg Gly Leu Lys Glu Lys Ile Val Arg Ser Phe Glu Val
185 190 195

Ser Pro Asp Gly Ser Phe Leu Leu Ile Asn Gly Ile Ala Gly Tyr
200 205 210

Leu His Leu Leu Ala Met Lys Thr Lys Glu Leu Ile Gly Ser Met
215 220 225

Lys Ile Asn Gly Arg Val Ala Ala Ser Thr Phe Ser Ser Asp Ser
230 235 240

Lys Lys Val Tyr Ala Ser Ser Gly Asp Gly Glu Val Tyr Val Trp
245 250 255

Asp Val Asn Ser Arg Lys Cys Leu Asn Arg Phe Val Asp Glu Gly
260 265 270

Ser Leu Tyr Gly Leu Ser Ile Ala Thr Ser Arg Asn Gly Gln Tyr
275 280 285

Val Ala Cys Gly Ser Asn Cys Gly Val Val Asn Ile Tyr Asn Gln
290 295 300

Asp Ser Cys Leu Gln Glu Thr Asn Pro Lys Pro Ile Lys Ala Ile
305 310 315

Met Asn Leu Val Thr Gly Val Thr Ser Leu Thr Phe Asn Pro Thr
320 325 330

Thr Glu Ile Leu Ala Ile Ala Ser Glu Lys Met Lys Glu Ala Val
335 340 345

Arg Leu Val His Leu Pro Ser Cys Thr Val Phe Ser Asn Phe Pro
350 355 360

Val Ile Lys Asn Lys Asn Ile Ser His Val His Thr Met Asp Phe
365 370 375

Ser Pro Arg Ser Gly Tyr Phe Ala Leu Gly Asn Glu Lys Gly Lys
380 385 390

Ala Leu Met Tyr Arg Leu His His Tyr Ser Asp Phe
395 400


(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 640 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: BRSTNOT19

(B) CLONE: 3094321

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Met Ala Leu Ser Arg Gly Leu Pro Arg Glu Leu Ala Glu Ala Val
5 10 15

Ala Gly Gly Arg Val Leu Val Val Gly Ala Gly Gly Ile Gly Cys
20 25 30

Glu Leu Leu Lys Asn Leu Val Leu Thr Gly Phe Ser His Ile Asp
35 40 45

Leu Ile Asp Leu Asp Thr Ile Asp Val Ser Asn Leu Asn Arg Gln
50 55 60

Phe Leu Phe Gln Lys Lys His Val Gly Arg Ser Lys Ala Gln Val
65 70 75

Ala Lys Glu Ser Val Leu Gln Phe Tyr Pro Lys Ala Asn Ile Val
80 85 90

Ala Tyr His Asp Ser Ile Met Asn Pro Asp Tyr Asn Val Glu Phe
95 100 105

Phe Arg Gln Phe Ile Leu Val Met Asn Ala Leu Asp Asn Arg Ala
110 115 120

Ala Arg Asn His Val Asn Arg Met Cys Leu Ala Ala Asp Val Pro
125 130 135

Leu Ile Glu Ser Gly Thr Ala Gly Tyr Leu Gly Gln Val Thr Thr
140 145 150

Ile Lys Lys Gly Val Thr Glu Cys Tyr Glu Cys His Pro Lys Pro
155 160 165

Thr Gln Arg Thr Phe Pro Gly Cys Thr Ile Arg Asn Thr Pro Ser
170 175 180

Glu Pro Ile His Cys Ile Val Trp Ala Lys Tyr Leu Phe Asn Gln
185 190 195

Leu Phe Gly Glu Glu Asp Ala Asp Gln Glu Val Ser Pro Asp Arg
200 205 210

Ala Asp Pro Glu Ala Ala Trp Glu Pro Thr Glu Ala Glu Ala Arg
215 220 225

Ala Arg Ala Ser Asn Glu Asp Gly Asp Ile Lys Arg Ile Ser Thr
230 235 240

Lys Glu Trp Ala Lys Ser Thr Gly Tyr Asp Pro Val Lys Leu Phe
245 250 255

Thr Lys Leu Phe Lys Asp Asp Ile Arg Tyr Leu Leu Thr Met Asp
260 265 270

Lys Leu Trp Arg Lys Arg Lys Pro Pro Val Pro Leu Asp Trp Ala
275 280 285

Glu Val Gln Ser Gln Gly Glu Glu Thr Asn Ala Ser Asp Gln Gln
290 295 300

Asn Glu Pro Gln Leu Gly Leu Lys Asp Gln Gln Val Leu Asp Val
305 310 315

Lys Ser Tyr Ala Arg Leu Phe Ser Lys Ser Ile Glu Thr Leu Arg
320 325 330

Val His Leu Ala Glu Lys Gly Asp Gly Ala Glu Leu Ile Trp Asp
335 340 345

Lys Asp Asp Pro Ser Ala Met Asp Phe Val Thr Ser Ala Ala Asn
350 355 360

Leu Arg Met His Ile Phe Ser Met Asn Met Lys Ser Arg Phe Asp
365 370 375

Ile Lys Ser Met Ala Gly Asn Ile Ile Pro Ala Ile Ala Thr Thr
380 385 390

Asn Ala Val Ile Ala Gly Leu Ile Val Leu Glu Gly Leu Lys Ile
395 400 405

Leu Ser Gly Lys Ile Asp Gln Cys Arg Thr Ile Phe Leu Asn Lys
410 415 420

Gln Pro Asn Pro Arg Lys Lys Leu Leu Val Pro Cys Ala Leu Asp
425 430 435

Pro Pro Asn Pro Asn Cys Tyr Val Cys Ala Ser Lys Pro Glu Val
440 445 450

Thr Val Arg Leu Asn Val His Lys Val Thr Val Leu Thr Leu Gln
455 460 465

Asp Lys Ile Val Lys Glu Lys Phe Ala Met Val Ala Pro Asp Val
470 475 480

Gln Ile Glu Asp Gly Lys Gly Thr Ile Leu Ile Ser Ser Glu Glu
485 490 495

Gly Glu Thr Glu Ala Asn Asn His Lys Lys Leu Ser Glu Phe Gly
500 505 510

Ile Arg Asn Gly Ser Arg Leu Gln Ala Asp Asp Phe Leu Gln Asp
515 520 525

Tyr Thr Leu Leu Ile Asn Ile Leu His Ser Glu Asp Leu Gly Lys
530 535 540

Asp Val Glu Phe Glu Val Val Gly Asp Ala Pro Glu Lys Val Gly
545 550 555

Pro Lys Gln Ala Glu Asp Ala Ala Lys Ser Ile Thr Asn Gly Ser
560 565 570

Asp Asp Gly Ala Gln Pro Ser Thr Ser Thr Ala Gln Glu Gln Asp
575 580 585

Asp Val Leu Ile Val Asp Ser Asp Glu Glu Asp Ser Ser Asn Asn
590 595 600

Ala Asp Val Ser Glu Glu Glu Arg Ser Arg Lys Arg Lys Leu Asp
605 610 615

Glu Lys Glu Asn Leu Ser Ala Lys Arg Ser Arg Ile Glu Gln Lys
620 625 630

Glu Glu Leu Asp Asp Val Ile Ala Leu Asp
635 640


(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: LUNGTUT13

(B) CLONE: 3115936

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Met Asp Lys Ile Leu Asn Val Glu Glu Thr Tyr Leu Thr Val Leu
5 10 15

Val Lys Ile Gly Pro Gly Phe His Thr Arg Glu Cys Phe Leu Leu
20 25 30

Lys Ser Ile Leu Cys Phe Ser Pro Ser Tyr Arg Met Ser Glu Gly
35 40 45

Asp Ser Val Gly Glu Ser Val His Gly Lys Pro Ser Val Val Tyr
50 55 60

Arg Phe Phe Thr Arg Leu Gly Gln Ile Tyr Gln Ser Trp Leu Asp
65 70 75

Lys Ser Thr Pro Tyr Thr Ala Val Arg Trp Val Val Thr Leu Gly
80 85 90

Leu Ser Phe Val Tyr Met Ile Arg Val Tyr Leu Leu Gln Gly Trp
95 100 105

Tyr Ile Val Thr Tyr Ala Leu Gly Ile Tyr His Leu Asn Leu Phe
110 115 120

Ile Ala Phe Leu Ser Pro Lys Val Asp Pro Ser Leu Met Glu Asp
125 130 135

Ser Asp Asp Gly Pro Ser Leu Pro Thr Lys Gln Asn Glu Glu Phe
140 145 150

Arg Pro Phe Ile Arg Arg Leu Pro Glu Phe Lys Phe Trp His Ala
155 160 165

Ala Thr Lys Gly Ile Leu Val Ala Met Val Cys Thr Phe Phe Asp
170 175 180

Ala Phe Asn Val Pro Val Phe Trp Pro Ile Leu Val Met Tyr Phe
185 190 195

Ile Met Leu Phe Cys Ile Thr Met Lys Arg Gln Ile Lys His Met
200 205 210

Ile Lys Tyr Arg Tyr Ile Pro Phe Thr His Gly Lys Arg Arg Tyr
215 220 225

Arg Gly Lys Glu Asp Ala Gly Lys Ala Phe Ala Ser
230 235

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 432 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: LUNGTUT13

(B) CLONE: 3116522

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Met Asp Ala Arg Trp Trp Ala Val Val Val Leu Ala Ala Phe Pro
5 10 15

Ser Leu Gly Ala Gly Gly Glu Thr Pro Glu Ala Pro Pro Glu Ser
20 25 30

Trp Thr Gln Leu Trp Phe Phe Arg Phe Val Val Asn Ala Ala Gly
35 40 45

Tyr Ala Ser Phe Met Val Pro Gly Tyr Leu Leu Val Gln Tyr Phe
50 55 60

Arg Arg Lys Asn Tyr Leu Glu Thr Gly Arg Gly Leu Cys Phe Pro
65 70 75

Leu Val Lys Ala Cys Val Phe Gly Asn Glu Pro Lys Ala Ser Asp
80 85 90

Glu Val Pro Leu Ala Pro Arg Thr Glu Ala Ala Glu Thr Thr Pro
95 100 105

Met Trp Gln Ala Leu Lys Leu Leu Phe Cys Ala Thr Gly Leu Gln
110 115 120

Val Ser Tyr Leu Thr Trp Gly Val Leu Gln Glu Arg Val Met Thr
125 130 135

Arg Ser Tyr Gly Ala Thr Ala Thr Ser Pro Gly Glu Arg Phe Thr
140 145 150

Asp Ser Gln Phe Leu Val Leu Met Asn Arg Val Leu Ala Leu Ile
155 160 165

Val Ala Gly Leu Ser Cys Val Leu Cys Lys Gln Pro Arg His Gly
170 175 180

Ala Pro Met Tyr Arg Tyr Ser Phe Ala Ser Leu Ser Asn Val Leu
185 190 195

Ser Ser Trp Cys Gln Tyr Glu Ala Leu Lys Phe Val Ser Phe Pro
200 205 210

Thr Gln Val Leu Ala Lys Ala Ser Lys Val Ile Pro Val Met Leu
215 220 225

Met Gly Lys Leu Val Ser Arg Arg Ser Tyr Glu His Trp Glu Tyr
230 235 240

Leu Thr Ala Thr Leu Ile Ser Ile Gly Val Ser Met Phe Leu Leu
245 250 255

Ser Ser Gly Pro Glu Pro Arg Ser Ser Pro Ala Thr Thr Leu Ser
260 265 270

Gly Leu Ile Leu Leu Ala Gly Tyr Ile Ala Phe Asp Ser Phe Thr
275 280 285

Ser Asn Trp Gln Asp Ala Leu Phe Ala Tyr Lys Met Ser Ser Val
290 295 300

Gln Met Met Phe Gly Val Asn Phe Phe Ser Cys Leu Phe Thr Val
305 310 315

Gly Ser Leu Leu Glu Gln Gly Ala Leu Leu Glu Gly Thr Arg Phe
320 325 330

Met Gly Arg His Ser Glu Phe Ala Ala His Ala Leu Leu Leu Ser
335 340 345

Ile Cys Ser Ala Cys Gly Gln Leu Phe Ile Phe Tyr Thr Ile Gly
350 355 360

Gln Phe Gly Ala Ala Val Phe Thr Ile Ile Met Thr Leu Arg Gln
365 370 375

Ala Phe Ala Ile Leu Leu Ser Cys Leu Leu Tyr Gly His Thr Val
380 385 390

Thr Val Val Gly Gly Leu Gly Val Ala Val Val Phe Ala Ala Leu
395 400 405

Leu Leu Arg Val Tyr Ala Arg Gly Arg Leu Lys Gln Arg Gly Lys
410 415 420

Lys Ala Val Pro Val Glu Ser Pro Val Gln Lys Val
425 430


(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT13 

(B) CLONE: 3117184 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75 : 



Met Ser Phe Pro Pro His Leu Asn Arg Pro Pro Met Gly Ile Pro
5 10 15

Ala Leu Pro Pro Gly Thr Pro Pro Pro Gln Phe Pro Gly Phe Pro
20 25 30

Pro Pro Val Pro Pro Gly Thr Pro Met Ile Pro Val Pro Met Ser
35 40 45

Ile Met Ala Pro Ala Pro Thr Val Leu Val Pro Thr Val Ser Met
50 55 60

Val Gly Lys His Leu Gly Ala Arg Lys Asp His Pro Gly Leu Lys
65 70 75

Ala Lys Glu Asn Asp Glu Asn Cys Gly Pro Thr Thr Thr Val Phe
80 85 90

Val Gly Asn Ile Ser Glu Lys Ala Ser Asp Met Leu Ile Arg Gln
95 100 105

Leu Leu Ala Lys Cys Gly Leu Val Leu Ser Trp Lys Arg Val Gln
110 115 120

Gly Ala Ser Gly Lys Leu Gln Ala Phe Gly Phe Cys Glu Tyr Lys
125 130 135

Glu Pro Glu Ser Thr Leu Arg Ala Leu Arg Leu Leu His Asp Leu
140 145 150

Gln Ile Gly Glu Lys Lys Leu Leu Val Lys Val Asp Ala Lys Thr
155 160 165

Lys Ala Gln Leu Asp Glu Trp Lys Ala Lys Lys Lys Ala Ser Asn
170 175 180

Gly Asn Ala Arg Pro Glu Thr Val Thr Asn Asp Asp Glu Glu Ala
185 190 195

Leu Asp Glu Glu Thr Lys Arg Arg Asp Gln Met Ile Lys Gly Ala
200 205 210

Ile Glu Val Leu Ile Arg Glu Tyr Ser Ser Glu Leu Asn Ala Pro
215 220 225

Ser Gln Glu Ser Asp Ser His Pro Arg Lys Lys Lys Lys Glu Lys
230 235 240

Lys Glu Asp Ile Phe Gly Arg Phe Gln Trp Ala His
245 250


(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 523 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LNODNOT05 

(B) CLONE: 3125156 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Met Gly Pro Gln Ala Ala Pro Leu Thr Ile Arg Gly Pro Ser Ser
5 10 15

Ala Gly Gln Ser Thr Pro Ser Pro His Leu Val Pro Ser Pro Ala
20 25 30

Pro Ser Pro Gly Pro Gly Pro Val Pro Pro Arg Pro Pro Ala Ala
35 40 45

Glu Pro Pro Pro Cys Leu Arg Arg Gly Ala Ala Ala Ala Asp Leu
50 55 60

Leu Ser Ser Ser Pro Glu Ser Gln His Gly Gly Thr Gln Ser Pro
65 70 75

Gly Gly Gly Gln Pro Leu Leu Gln Pro Thr Lys Val Asp Ala Ala
80 85 90

Glu Gly Arg Arg Pro Gln Ala Leu Arg Leu Ile Glu Arg Asp Pro
95 100 105

Tyr Glu His Pro Glu Arg Leu Arg Gln Leu Gln Gln Glu Leu Glu
110 115 120

Ala Phe Arg Gly Gln Leu Gly Asp Val Gly Ala Leu Asp Thr Val
125 130 135

Trp Arg Glu Leu Gln Asp Ala Gln Glu His Asp Ala Arg Gly Arg
140 145 150

Ser Ile Ala Ile Ala Arg Cys Tyr Ser Leu Lys Asn Arg His Gln
155 160 165

Asp Val Met Pro Tyr Asp Ser Asn Arg Val Val Leu Arg Ser Gly
170 175 180

Lys Asp Asp Tyr Ile Asn Ala Ser Cys Val Glu Gly Leu Ser Pro
185 190 195

Tyr Cys Pro Pro Leu Val Ala Thr Gln Ala Pro Leu Pro Gly Thr
200 205 210

Ala Ala Asp Phe Trp Leu Met Val His Glu Gln Lys Val Ser Val
215 220 225

Ile Val Met Leu Val Ser Glu Ala Glu Met Glu Lys Gln Lys Val
230 235 240

Ala Arg Tyr Phe Pro Thr Glu Arg Gly Gln Pro Met Val His Gly
245 250 255

Ala Leu Ser Leu Ala Leu Ser Ser Val Arg Ser Thr Glu Thr His
260 265 270

Val Glu Arg Val Leu Ser Leu Gln Phe Arg Asp Gln Ser Leu Lys
275 280 285

Arg Ser Leu Val His Leu His Phe Pro Thr Trp Pro Glu Leu Gly
290 295 300

Leu Pro Asp Ser Pro Ser Asn Leu Leu Arg Phe Ile Gln Glu Val
305 310 315

His Ala His Tyr Leu His Gln Arg Pro Leu His Thr Pro Ile Ile
320 325 330

Val His Cys Ser Ser Gly Val Gly Arg Thr Gly Ala Phe Ala Leu
335 340 345

Leu Tyr Ala Ala Val Gln Glu Val Glu Ala Gly Asn Gly Ile Pro
350 355 360

Glu Leu Pro Gln Leu Val Arg Arg Met Arg Gln Gln Arg Lys His
365 370 375

Met Leu Gln Glu Lys Leu His Leu Arg Phe Cys Tyr Glu Ala Val
380 385 390

Val Arg His Val Glu Gln Val Leu Gln Arg His Gly Val Pro Pro
395 400 405

Pro Cys Lys Pro Leu Ala Ser Ala Ser Ile Ser Gln Lys Asn His
410 415 420

Leu Pro Gln Asp Ser Gln Asp Leu Val Leu Gly Gly Asp Val Pro
425 430 435

Ile Ser Ser Ile Gln Ala Thr Ile Ala Lys Leu Ser Ile Arg Pro
440 445 450

Pro Gly Gly Leu Glu Ser Pro Val Ala Ser Leu Pro Gly Pro Ala
455 460 465

Glu Pro Pro Gly Leu Pro Pro Ala Ser Leu Pro Glu Ser Thr Pro
470 475 480

Ile Pro Ser Ser Ser Gln Thr Pro Phe Pro Pro His Tyr Leu Arg
485 490 495

Leu Pro Ser Leu Arg Arg Ser Arg Gln Cys Leu Lys Pro Pro Ala
500 505 510

Arg Gly Pro Pro Pro Pro Pro Trp Asn Cys Trp Pro Pro
515 520



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 621 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT12 

(B) CLONE: 3129120 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Met Gly Leu Leu Ser Asp Pro Val Arg Arg Arg Ala Leu Ala Arg
5 10 15

Leu Val Leu Arg Leu Asn Ala Pro Leu Cys Val Leu Ser Tyr Val
20 25 30

Ala Gly Ile Ala Trp Phe Leu Ala Leu Val Phe Pro Pro Leu Thr
35 40 45

Gln Arg Thr Tyr Met Ser Glu Asn Ala Met Gly Ser Thr Met Val
50 55 60

Glu Glu Gln Phe Ala Gly Gly Asp Arg Ala Arg Ala Phe Ala Arg
65 70 75

Asp Phe Ala Ala His Arg Lys Lys Ser Gly Ala Leu Pro Val Ala
80 85 90

Trp Leu Glu Arg Thr Met Arg Ser Val Gly Leu Glu Val Tyr Thr
95 100 105

Gln Ser Phe Ser Arg Lys Leu Pro Phe Pro Asp Glu Thr His Glu
110 115 120

Arg Tyr Met Val Ser Gly Thr Asn Val Tyr Gly Ile Leu Arg Ala
125 130 135

Pro Arg Ala Ala Ser Thr Glu Ser Leu Val Leu Thr Val Pro Cys
140 145 150

Gly Ser Asp Ser Thr Asn Ser Gln Ala Val Gly Leu Leu Leu Ala
155 160 165

Leu Ala Ala His Phe Arg Gly Gln Ile Tyr Trp Ala Lys Asp Ile
170 175 180

Val Phe Leu Val Thr Glu His Asp Leu Leu Gly Thr Glu Ala Trp
185 190 195

Leu Glu Ala Tyr His Asp Val Asn Val Thr Gly Met Gln Ser Ser
200 205 210

Pro Leu Gln Gly Arg Ala Gly Ala Ile Gln Ala Ala Val Ala Leu
215 220 225

Glu Leu Ser Ser Asp Val Val Thr Ser Leu Asp Val Ala Val Glu
230 235 240

Gly Leu Asn Gly Gln Leu Pro Asn Leu Asp Leu Leu Asn Leu Phe
245 250 255

Gln Thr Phe Cys Gln Lys Gly Gly Leu Leu Cys Thr Leu Gln Gly
260 265 270

Lys Leu Gln Pro Glu Asp Trp Thr Ser Leu Asp Gly Pro Leu Gln
275 280 285

Gly Leu Gln Thr Leu Leu Leu Met Val Leu Arg Gln Ala Ser Gly
290 295 300

Arg Pro His Gly Ser His Gly Leu Phe Leu Arg Tyr Arg Val Glu
305 310 315

Ala Leu Thr Leu Arg Gly Ile Asn Ser Phe Arg Gln Tyr Lys Tyr
320 325 330

Asp Leu Val Ala Val Gly Lys Ala Leu Glu Gly Met Phe Arg Lys
335 340 345

Leu Asn His Leu Leu Glu Arg Leu His Gln Ser Phe Phe Leu Tyr
350 355 360

Leu Leu Pro Gly Leu Ser Arg Phe Val Ser Ile Gly Leu Tyr Met
365 370 375

Pro Ala Val Gly Phe Leu Leu Leu Val Leu Gly Leu Lys Ala Leu
380 385 390

Glu Leu Trp Met Gln Leu His Glu Ala Gly Met Gly Leu Glu Glu
395 400 405

Pro Gly Gly Ala Pro Gly Pro Ser Val Pro Leu Pro Pro Ser Gln
410 415 420

Gly Val Gly Leu Ala Ser Leu Val Ala Pro Leu Leu Ile Ser Gln
425 430 435

Ala Met Gly Leu Ala Leu Tyr Val Leu Pro Val Leu Gly Gln His
440 445 450

Val Ala Thr Gln His Phe Pro Val Ala Glu Ala Glu Ala Val Val
455 460 465

Leu Thr Leu Leu Ala Ile Tyr Ala Ala Gly Leu Ala Leu Pro His
470 475 480

Asn Thr His Arg Val Val Ser Thr Gln Ala Pro Asp Arg Gly Trp
485 490 495

Met Ala Leu Lys Leu Val Ala Leu Ile Tyr Leu Ala Leu Gln Leu
500 505 510

Gly Cys Ile Ala Leu Thr Asn Phe Ser Leu Gly Phe Leu Leu Ala
515 520 525

Thr Thr Met Val Pro Thr Ala Ala Leu Ala Lys Pro His Gly Pro
530 535 540

Arg Thr Leu Tyr Ala Ala Leu Leu Val Leu Thr Ser Pro Ala Ala
545 550 555

Thr Leu Leu Gly Ser Leu Phe Leu Trp Arg Glu Leu Gln Glu Ala
560 565 570

Pro Leu Ser Leu Ala Glu Gly Trp Gln Leu Phe Leu Ala Ala Leu
575 580 585

Ala Gln Gly Val Leu Glu His His Thr Tyr Gly Ala Leu Leu Phe
590 595 600

Pro Leu Leu Ser Leu Gly Leu Tyr Pro Cys Trp Leu Leu Phe Trp
605 610 615

Asn Val Leu Phe Trp Lys
620



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2347 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: HEARNOT01

(B) CLONE: 305841 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
CCCTCGAGAA GATGGCGGCG ACTCTGGGAC CCCTTGGGTC GTGGCAGCAG TGGCGGCGAT 60
GTTTGTCGGC TCGGGATGGG TCCAGGATGT TACTCCTTCT TCTTTTGTTG GGGTCTGGGC 120
AGGGGCCACA GCAAGTCGGG GCGGGTCAAA CGTTCGAGTA CTTGAAACGG GAGCACTCGC 180
TGTCGAAGCC CTACCAGGGT GTGGGCACAG GCAGTTCCTC ACTGTGGAAT CTGATGGGCA 240
ATGCCATGGT GATGACCCAG TATATCCGCC TTACCCCAGA TATGCAAAGT AAACAGGGTG 300
CCTTGTGGAA CCGGGTGCCA TGTTTCCTGA GAGACTGGGA GTTGCAGGTG CACTTCAAAA 360
TCCATGGACA AGGAAAGAAG AATCTGCATG GGGATGGCTT GGCAATCTGG TACACAAAGG 420
ATCGGATGCA GCCAGGGCCT GTGTTTGGAA ACATGGACAA ATTTGTGGGG CTGGGAGTAT 480
TTGTAGACAC CTACCCCAAT GAGGAGAAGC AGCAAGAGCG GGTATTCCCC TACATCTCAG 540
CCATGGTGAA CAACGGCTCC CTCAGCTATG ATCATGAGCG GGATGGGCGG CCTACAGAGC 600
TGGGAGGCTG CACAGCCATT GTCCGCAATC TTCATTACGA CACCTTCCTG GTGATTCGCT 660
ACGTCAAGAG GCATTTGACG ATAATGATGG ATATTGATGG CAAGCATGAG TGGAGGGACT 720
GCATTGAAGT GCCCGGAGTC CGCCTGCCCC GCGGCTACTA CTTCGGCACC TCCTCCATCA 780
CTGGGGATCT CTCAGATAAT CATGATGTCA TTTCCTTGAA GTTGTTTGAA CTGACAGTGG 840
AGAGAACCCC AGAAGAGGAA AAGCTCCATC GAGATGTGTT CTTGCCCTCA GTGGACAATA 900
TGAAGCTGCC TGAGATGACA GCTCCACTGC CGCCCCTGAG TGGCCTGGCC CTCTTCCTCA 960
TCGTCTTTTT CTCCCTGGTG TTTTCTGTAT TTGCCATAGT CATTGGTATC ATACTCTACA 1020
ACAAATGGCA GGAACAGAGC CGAAAGCGCT TCTACTGAGC CCTCCTGCTG CCACCACTTT 1080
TGTGACTGTC ACCCATGAGG TATGGAAGGA GCAGGCACTG GCCTGAGCAT GCAGCCTGGA 1140
GAGTGTTCTT GTCTCTAGCA GCTGGTTGGG GACTATATTC TGTCACTGGA GTTTTGAATG 1200
CAGGGACCCC GCATTCCCAT GGTTGTGCAT GGGGACATCT AACTCTGGTC TGGGAAGCCA 1260
CCCACCCCAG GGCAATGCTG CTGTGATGTG CCTTTCCCTG CAGTCCTTCC ATGTGGGAGC 1320
AGAGGTGTGA AGAGAATTTA CGTGGTTGTG ATGCCAAAAT CACAGAACAG AATTTCATAG 1380
CCCAGGCTGC CGTGTTGTTT GACTCAGAAG GCCCTTCTAC TTCAGTTTTG AATCCACAAA 1440
GAATTAAAAA CTGGTAACAC CACAGGCTTT CTGACCATCC ATTCGTTGGG TTTTGCATTT 1500
GACCCAACCC TCTGCCTACC TGAGGAGCTT TCTTTGGAAA CCAGGATGGA AACTTCTTCC 1560
CTGCCTTACC TTCCTTTCAC TCCATTCATT GTCCTCTCTG TGTGCAACCT GAGCTGGGAA 1620
AGGCATTTGG ATGCCTCTCT GTTGGGGCCT GGGGCTGCAG AACACACCTG CGTTTCACTG 1680
GCCTTCATTA GGTGGCCCTA GGGAGATGGC TTTCTGCTTT GGATCACTGT TCCCTAGCAT 1740
GGGTCTTGGG TCTATTGGCA TGTCCATGGC CTTCCCAATC AAGTCTCTTC AGGCCCTCAG 1800
TGAAGTTTGG CTAAAGGTTG GTGTAAAAAT CAAGAGAAGC CTGGAAGACA TCATGGATGC 1860
CATGGATTAG CTGTGCAACT GACCAGCTCC AGGTTTGATC AAACCAAAAG CAACATTTGT 1920
CATGTGGTCT GACCATGTGG AGATGTTTCT GGACTTGCTA GAGCCTGCTT AGCTGCATGT 1980
TTTGTAGTTA CGATTTTTGG AATCCCACTT TGAGTGCTGA AAGTGTAAGG AAGCTTTCTT 2040
CTTACACCTT GGGCTTGGAT ATTGCCCAGA GAAGAAATTT GGCTTTTTTT TTCTTAATGG 2100
ACAAGAGACA GTTGCTGTTC TCATGTTCCA AGTCTGAGAG CAACAGACCC TCATCATCTG 2160
TGCCTGGAAG AGTTCACTGT CATTGAGCAG CACAGCCTGA GTGCTGGCCT CTGTCAACCC 2220
TTATTCCACT GCCTTATTTG ACAAGGGGTT ACATGCTGCT CACCTTACTG CCCTGGGATT 2280
AAATCAGTTA CAGGCCAGAG TCTCCTTGGA GGGCCTGGAA CTCTGAGTCC TCCTATGAAC 2340
CTCTGTA 2347



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1529 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: EOSIHET02 

(B) CLONE: 322866 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
CCCACGCGTC CGCCAGCCTT GTCTCGGCCA CCTCAAGGAT AATCACTAAA TTCTGCCGAA 60
AGGACTGAGG AACGGTGCCT GGAAAAGGGC AAGAATATCA CGGCATGGGC ATGAGTAGCT 120
TGAAACTGCT GAAGTATGTC CTGTTTTTCT TCAACTTGCT CTTTTGGATC TGTGGCTGCT 180
GCATTTTGGG CTTTGGGATC TACCTGCTGA TCCACAACAA CTTCGGAGTG CTCTTCCATA 240
ACCTGCCCTC CCTCACGCTG GGCAATGTGT TTGTCATCGT GGGCTCTATT ATCATGGTAG 300
TTGCCTTCCT GGGCTGCATG GGCTCTATCA AGGAAAACAA GTGTCTGCTT ATGTCGTTCT 360
TCATCCTGCT GCTGATTATC CTCCTTGCTG AGGTGACCTT GGCCATCCTG CTCTTTGTAT 420
ATGAACAGAA GCTGAATGAG TATGTGGCTA AGGGTCTGAC CGACAGCATC CACCGTTACC 480
ACTCAGACAA TAGCACCAAG GCAGCGTGGG ACTCCATCCA GTCATTTCTG CAGTGTTGTG 540
GTATAAATGG CACGAGTGAT TTGGACAGTG GCTCACCAGC ATCTTGCCCC TCAGATCGAA 600
AAGTGGAGGG GTGCTATGCG AAAGAAGACT TTGGTTTCAT TCAATTTCCT GTATATCGGA 660
ATCATCACCA TCTGTGTATG TGTGATTGAG GTGTTGGGGG ATGTCCTTTG CACTGACCCT 720
GAACTGCCAG ATTGACAAAA CCAGCCAGAC CATAGGGCTA TGATCTGCAG TAGTTCTGTG 780
GTGAAGAGAC TTGTTTCATC TCCGGAAATG CAAAACCATT TATAGCATGA AGCCCTACAT 840
GATCACTGCA GGATGATCCT CCTCCCATCC TTTCCCTTTT TAGGTCCCTG TCTTATACAA 900
CCAGAGAAGT GGGTGTTGGC CAGGCACATC CCATCTCAGG CAGCAAGACA ATCTTTCACT 960
CACTGACGGC AGCAGCCATG TCTCTCAAAG TGGTGAAACT AATATCTGAG CATCTTTTAG 1020
ACAAGAGAGG CAAAGACAAA CTGGATTTAA TGGCCCAACA TCAAAGGGTG AACCCAGGAT 1080
ATGAATTTTT GCATCTTCCC ATTGTCGAAT TAGTCTCCAG CCTCTAAATA ATGCCCAGTC 1140
TTCTCCCCAA AGTCAAGCAA GAGACTAGTT GAAGGGAGTT CTGGGGCCAG GCTCACTGGA 1200
CCATTGTCAC AACCCTCTGT TTCTCTTTGA CTAAGTGCCC TGGCTACAGG AATTACACAG 1260
TTCTCTTTCT CCAAAGGGCA AGATCTCATT TCAATTTCTT TATTAGAGGG CCTTATTGAT 1320
GTGTTCTAAG TCTTTCCAGA AAAAAACTAT CCAGTGATTT ATATCCTGAT TTCAACCAGT 1380
CACTTAGCTG ATAATCACAG TAAGAAGACT TCTGGTATTA TCTCTCTATC AGATAAGATT 1440
TTGTTAATGT ACTATTTTAC TCTTCAATAA ATAAAACAGT TTATTATCTC AAAAAAAAAA 1500
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 1529

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4387 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BEPINOT01 

(B) CLONE: 546656 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
GCATCCCCGC TTCCGGGTTA GGCCGTTCCT GCCCGCCCCC TCCTCTCCTC CCTTCGGACC 60
CATAGATCTC AGGCTCGGCT CCCCGCCCGC CGCAGCCCAC TGTTGACCCG GCCCGTACTG 120
CGGCCCCGTG GCCACCATGT CCCTGCACGG CAAACGGAAG GAGATCTACA AGTATGAAGC 180
GCCCTGGACA GTCTACGCGA TGAACTGGAG TGTGCGGCCC GATAAGCGCT TTCGCTTGGC 240
GCTGGGCAGC TTCGTGGAGG AGTACAACAA CAAGGTTCAG CTTGTTGGTT TAGATGAGGA 300
GAGTTCAGAG TTTATTTGCA GAAACACCTT TGACCACCCA TACCCCACCA CAAAGCTCAT 360
GTGGATCCCT GACACAAAAG GCGTCTATCC AGACCTACTG GCAACAAGCG GTGACTATCT 420
CCGTGTGTGG AGGGTTGGTG AAACAGAGAC CAGGCTGGAG TGTTTGCTAA ACAATAATAA 480
GAACTCTGAT TTCTGTGCTC CCCTGACCTC CTTTGACTGG AATGAGGTGG ATCCTTATCT 540
TTTAGGTACC TCAAGCATTG ATACGACATG CACCATCTGG GGGCTGGAGA CAGGGCAGGT 600
GTTAGGGCGA GTGAATCTCG TGTCTGGCCA CGTGAAGACC CAGCTGATCG CCCATGACAA 660
AGAGGTCTAT GATATTGCAT TTAGCCGGGC CGGGGGTGGC AGGGACATGT TTGCCTCTGT 720
GGGTGCTGAT GGCTCGGTGC GGATGTTTGA CCTCCGCCAT CTAGAACACA GCACCATCAT 780
TTACGAAGAC CCACAGCATC ACCCACTGCT TCGCCTCTGC TGGAACAAGC AGGACCCTAA 840
CTACCTGGCC ACCATGGCCA TGGATGGAAT GGAGGTGGTG ATTCTAGATG TCCGGGTTCC 900
CTGCACACCT GTCGCCAGGT TAAACAACCA TCGAGCATGT GTCAATGGCA TTGCTTGGGC 960
CCCACATTCA TCCTGCCACA TCTGCACTGC AGCGGATGAC CACCAGGCTC TCATCTGGGA 1020
CATCCAGCAA ATGCCCCGAG CCATTGAGGA CCCTATCCTG GCCTACACAG CTGAAGGAGA 1080
GATCAACAAT GTGCAGTGGG CATCAACTCA GCCCGACTGG ATCGCCATCT GCTACAACAA 1140
CTGCCTGGAG ATACTCAGAG TGTAGTGTTG GTGGCGCTGT GCCCACGAGG CAGGGGCTTT 1200
TGTATTTCCT GCCTCTGCCC CACCCCCAAA GTAAGAAGAA ACATGTTTCC AGTGGCCAGT 1260
ATGTCTTTCA TTGCTTTGCA CCCACTGTTA CCAGAAGCTG CTCTAGGAGT TCCTGGCCAG 1320
TCACCCCATC GCCCTCTGTG GCAGACTCAG TGCTGTGTGG CGCCTCCTCA GCCCAGGGCT 1380
GAGTTTTAAG ATTTTCTCTC CTTTCCTCTT CTCCTTTGGT TCCTCAATTA AAAAATGTGT 1440
GTATATTTGT TTGTCAGGCG TTGTGTTGAG GAGCAGTTCA CGCACTGGCT GTGTCTATTC 1500
CTCTGCCCAG GTGTCTCTGT TTGCTGCCCA AGGCAGCAGT TCATGTCTCG TCCATGTCCA 1560
TGTTCGTGTT AGCACTTACG TGGGAACAAA TACCAATTTG TCTTTTCTCC TAGTATCAGT 1620
GTGTTTAACA AATTTTAACT TTGTATATTT GTTATCTATC AGGCTAATTT TTTTATGAAA 1680
AGAATTTTAC TCTCCTGCTT CATTTCTTTG TCTTATAGTC CTCCCTCTTT GCACCTTCTT 1740
CTCTTCCCTC AGTGCCTGGA GCTGGTACTG GGCCCCTGGG CCCCATGAGC AGTTTGCCTT 1800
CTTGAGTCAC TGCCTGTGTA GTACATACCT GACCGGGAGT CCAAACCACC TTGGTGCTCT 1860
GAAGTCCACT GACTCATCAC ACCTTTCTTA GCCTGGCTCC TCTCAAGGGC ATTCTGGGCT 1920
TGTAAACAGA CATAGGAAGC CTCTGTTTAC CCTGAAGCAC CACTGTCCAG CCCATTGGTT 1980
CCCACTGGCA GCATGGTAGA GCTGAGAGAA ACAGGCTCTC AGGGTACCTG ACTTGAGGGG 2040
AATCGTTTCA TGAAGCTGAA CTTCAAGCAT ATTTCCAGTA CATTCTTTCA GAGTCTGTTT 2100
TTCCATCCAA ATATAAGCCC CAGGCCATTC CACTTAGTGT CTTTTCAATG ATAGGCAAGA 2160
ATGATATCTG AGTTGAACTT CGGTGCTTCT GTTGTTTGAG TTTACTGTGC CTGGTGGTAT 2220
ATTGGGCATT CTTTGGATTG AGTGTTCTGA GGTGAGAGAG TCTTCCCGAG GCATCCTGTC 2280
TGTGCTTCCA ACCCTGAACA AGACCTTACA TGAGAGATGG ACTGATGGAC TGCGGCAATC 2340
CTGGGCTGTC AAGTGGATAG ATAGTTAAAA AGCATTATAC TGTGGGTAAT GAAAAGGGAG 2400
GAAAAAAAAA GAAGGAAAAG GAATTATAGA CCCCCAGGGT CAGCCAGTTA AGAGCTCTAC 2460
CCACACCTGT CAACCCCTCT CTCCCCCAGT TTAGGTTCTG AGCAGTATTG GACTTGTAGC 2520
CTGCAGTTGT CTTTTGACTT GCAGGCCGCA GGTGTCTTTC TGTTATGTGA ATGAGTTCCA 2580
TGGAGGGGCA TATGTGTGAT TCCACCGTTA GATGAGCCCT TGGGGCAGGC AGTTTGGGAT 2640
GTGCTCTTGG GGGAAAGTTG GCTGTTTCCT TGCGCTCTGC TCCTACCCGA AGGTTTTTAA 2700
GTCCCTCTGA ATTGCTCATC TGAGATTAGT AGAGTAGCAG GCCTGAAGGA TGATGGTTTT 2760
GTCCTCTTTG GTTCTCACCT GCTTGAGAAG TAAAACAGTA ACTTTGTTCT TCTGGGCCCT 2820
TAAGCTTTTT TGGTTAAGTC TTCCTTTTCA GAAGTAGATG TCATTATATG CCAAAAGTCT 2880
AGCTCTTTGC TTTACCATAC AGGGACCTGT CCCAAAGAAA AAGGCTCTTT TTTTAGCCAG 2940
CATATTTCCC CTTCTACCCT TTTACTTTGT TGTTCTGATT TTAGGACTCT GGCTGGCCAT 3000
GTGCTTGTGG TTGCCTCTCC TGCATTTGCC ACTGGATTTG CACTGCATCG TTTGGAGATA 3060
CAAAGCGAGC AGTTCTTGGT CAGAACCCTC CTCTGCTTTT CATTGTGTTT GATAATGGTT 3120
ACTGGGTCCT TCTCTCAAGG GTAGCAAGGC CAAGCTGATG GCTGCTTGTT TAGGAGGCCA 3180
TCAGTTCCTT CCTGTGGAGA AGGGTCTGAA ATGGAAGTCA GTGGTAGAAG GGGCTGGTCT 3240
GCTGGGCAGG GCTTACATCC ACTGAGTTCT AAGATTCCTT TCCTGATCTG CACCTACGCC 3300
TGGTCTGTAT GGTGGAATTT GTCAGCTGGA ACTCAGAAAC AACAACTTGA AAAAAAAATA 3360
ATAATTAGAA CATATTTGCA TAAGATAGCT ATTTACTCTG GAAACCAACA ACTTTTGAGA 3420
TTTCCCTTGC CCTGTGGACG CCCAGCTCCT GTCATCCTTC CTTAGGTCCT GCAGTACAGT 3480
CTTCCCCTGA ATGCCACCGG GGACCCAGGG GGACTCCACC CCCCTAAGCA AGCACACACA 3540
TACTCACAGT TGATGAGTTG CTGGTCTTTG AGTCCCAGCT CTCTTACCCT CCCTTTACTC 3600
CACCAGCCCG ACGACCCATG ACTGAGGAGG GGATTTCTAC AGTCTCAGGA TTTAGAAAGT 3660
CTGTAAGCCA TCCATGCTCC AGAAAGCACC GATCTGTTGT AGTTGCAAAA ACAACTCTGT 3720
AATTTGTTGA GGTTCTCAAA CTGACAGCCA GCGAGACTGG GTGGGAGGCC CTGGATCTGT 3780
TCTCCCTGAC TGCGGGAGGA GCAGCCACTA GGACTTTAGC AGGAAGCCCA CATGGAGGCT 3840
CCGCCAGGCT GTGGCCCAGC TGGTGATGGC CCTTTTGCTC CTGGCAGCCT GAGGCACAGC 3900
TGCCTGTATT GTCCTCATCT GTTCTGACTG AAGGATGGAG GTGCTGAATA AATTAGGCCT 3960
CAGGCCTCTA CCACCAGAGA GCTGGAGAAT GGGTCCACGT CATTCAAGGA CCTGAATTTT 4020
TTATGCTCAG GAGCATTGGA ATCCTCTTCT TCCAGGGAGG AATTAGCCTG CAAGGTTAGG 4080
ACTTGAAGAG GGAAGGTATT TAATAACTGG GCGAGGATGG GTGTGGTGGC TCACACCTGT 4140
AATCCCAGCA TTTTGGGAGG CTGAGGTGGC CAGATCCCAA GGTCAGAAGA TCGAGACCAT 4200
CCTGGCTAAC ATGGTGAAAC CCCATCTCTA CTAAAAATAC AAAAAAAAAT TAGCCGGGGG 4260
TGGTGGCGGG TACCTGTAGT CCTAGCTACT TGGGAGGCTG AGGCAGGAGA ATGGCGTGAA 4320
CCTGGGAGGT GGAGCTTGCA GTGAGCCAAG ATCGTCCACT CACTGCAGCC TGGCGACAGA 4380
GCAAGCG 4387

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2117 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SYNORAT03 

(B) CLONE: 693453 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GCCTGAGCGG GAAGCATTGG CGTCCGAGCG ACTTCTAGGA GCCTGGGGTT CGGCGCTATG 60
GAGGAGCTCG ATGGCGAGCC AACAGTCACT TTGATTCCAG GCGTGAATTC CAAGAAGAAC 120
CAAATGTATT TTGACTGGGG TCCAGGGGAG ATGCTGGTAT GTGAAACCTC CTTCAACAAA 180
AAAGAAAAAT CAGAGATGGT GCCAAGTTGC CCCTTTATCT ATATCATCCG TAAGGATGTA 240
GATGTTTACT CTCAAATCTT GAGAAAACTC TTCAATGAAT CCCATGGAAT CTTTCTGGGC 300
CTCCAGAGAA TTGACGAAGA GTTGACTGGA AAATCCAGAA AATCTCAATT GGTTCGAGTG 360
AGTAAAAACT ACCGATCAGT CATCAGAGCA TGTATGGAGG AAATGCACCA GGTTGCAATT 420
GCTGCTAAAG ATCCAGCCAA TGGCCGCCAG TTCAGCAGCC AGGTCTCCAT TTTGTCAGCA 480
ATGGAGCTCA TCTGGAACCT GTGTGAGATT CTTTTTATTG AAGTGGCCCC AGCTGGCCCT 540
CTCCTCCTCC ATCTCCTTGA CTGGGTCCGG CTCCATGTGT GCGAGGTGGA CAGTTTGTCG 600
GCAGATGTTC TGGGCAGTGA GAATCCAAGC AAACATGACA GCTTCTGGAA CTTGGTGACC 660
ATCTTGGTGC TGCAGGGCCG GCTGGATGAG GCCCGACAGA TGCTCTCCAA GGAAGCCGAT 720
GCCAGCCCCG CCTCTGCAGG CATATGCCGA ATCATGGGGG ACCTGATGAG GACAATGCCC 780
ATTCTTAGTC CTGGGAACAC CCAGACACTG ACAGAGCTGG AGCTGAAGTG GCAGCACTGG 840
CACGAGGAAT GTGAGCGGTA CCTCCAGGAC AGCACATTCG CCACCAGCCC TCACCTGGAG 900
TCTCTCTTGA AGATTATGCT GGGAGACGAA GCTGCCTTGT TAGAGCAGAA GGAACTTCTG 960
AGTAATTGGT ATCATTTCCT AGTGACTCGG CTCTTGTACT CCAATCCCAC AGTAAAACCC 1020
ATTGATCTGC ACTACTATGC CCAGTCCAGC CTGGACCTGT TTCTGGGAGG TGAGAGCAGC 1080
CCAGAACCCC TGGACAACAT CTTGTTGGCA GCCTTTGAGT TTGACATCCA TCAAGTAATC 1140
AAAGAGTGCA GCATCGCCCT GAGCAACTGG TGGTTTGTGG CCCACCTGAC AGACCTGCTG 1200
GACCACTGCA AGCTCCTCCA GTCACACAAC CTCTATTTCG GTTCCAACAT GAGAGAGTTC 1260
CTCCTGCTGG AGTACGCCTC GGGACTGTTT GCTCATCCCA GCCTGTGGCA GCTGGGGGTC 1320
GATTACTTTG ATTACTGCCC CGAGCTGGGC CGAGTCTCCC TGGAGCTGCA CATTGAGCGG 1380
ATACCTCTGA ACACCGAGCA GAAAGCCCTG AAGGTGCTGC GGATCTGTGA GCAGCGGCAG 1440
ATGACTGAAC AAGTTCGCAG CATTTGTAAG ATCTTAGCCA TGAAAGCCGT CCGCAACAAT 1500
CGCCTGGGTT CTGCCCTCTC TTGGAGCATC CGTGCTAAGG ATGCCGCCTT TGCCACGCTC 1560
GTGTCAGACA GGTTCCTCAG GGATTACTGT GAGCGAGGCT GCTTTTCTGA TTTGGATCTC 1620
ATTGACAACC TGGGGCCAGC CATGATGCTC AGTGACCGAC TGACATTCCT GGGAAAGTAT 1680
CGCGAGTTCC ACCGTATGTA CGGGGAGAAG CGTTTTGCCG ACGCAGCTTC TCTCCTTCTG 1740
TCCTTGATGA CGTCTCGGAT TGCCCCTCGG TCTTTCTGGA TGACTCTGCT GACAGATGCC 1800
TTGCCCCTTT TGGAACAGAA ACAGGTGATT TTCTCAGCAG AACAGACTTA TGAGTTGATG 1860
CGGTGTCTGG AGGACTTGAC GTCAAGAAGA CCTGTGCATG GAGAATCTGA TACCGAGCAG 1920
CTCCAGGATG ATGACATAGA GACCACCAAG GTGGAAATGC TGAGACTTTC TCTGGCACGA 1980
AATCTTGCTC GGGCAATTAT AAGAGAAGGC TCACTGGAAG GTTCCTGAGA ACTGCTTCAA 2040
TGTGGTATCT TTGTATGGCA ATGTATATAG ATTTTTTAAA AGAATAAATG TTGTTTGCAA 2100
AAAAAAAAAA AAAAAAA 2117



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 846 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAITUT03 

(B) CLONE: 866885 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
GGCGGGCGGA GTCTGCAGGA TGGCACCGGA CCCCTGGTTC TCCACATACG ATTCTACTTG 60
TCAAATTGCC CAAGAAATTG CTGAGAAAAT TCAACAACGA AATCAATATG AAAGAAAAGG 120
TGAAAAGGCA CCAAAGCTTA CCGTGACAAT CAGAGCTTTG TTGCAGAACC TGAAGGAAAA 180
GATCGCCCTT TTGAAGGACT TATTGCTAAG AGCTGTGTCA ACACATCAGA TAACACAGCT 240
TGAAGGGGAC CGAAGACAGA ACCTCTTGGA TGATCTTGTA ACTCGAGAGA GACTACTTCT 300
GGCATCCTTT AAGAATGAGG GTGCCGAACC AGATCTAATC AGGTCCAGCC TGATGAGTGA 360
AGAGGCTAAG CGAGGAGCAC CCAACCCTTG GCTCTTTGAG GAGCCAGAGG AGACCAGAGG 420
CTTGGGTTTT GATGAAATCC GGCAACAGCA GCAGAAAATT ATCCAAGAAC AGGATGCAGG 480
CCTTGATGCC CTTTCCTCTA TCATAAGTCG CCAAAAACAA ATGGGGCAGG AAATTGGGAA 540
TGAATTGGAT GAACAAAATG AGATAATTGA CGACCTTGCC AACCTAGTGG AGAACACAGA 600
TGAAAAACTT CGCAATGAAA CCAGGCGGGT AAACATGGTG GACAGAAAGT CAGCCTCTTG 660
TGGGATGATC ATGGTGATTT TACTGCTGCT TGTGGCTATC GTGGTTGTTG CAGTCTGGCC 720
GACCAACTGA TGGCAGTAAA GAGACCACCA GCAGTGACAC CTGGCAATGA CAGATGCAAG 780
CCCAACACCC TTTTGGTACG CAAAACCTGC TCTCAATAAA TTCCCCCAAA GCTCTGAAAA 840
AAAAAA 846



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1011 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT03 

(B) CLONE: 1242271 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
GAAAGAGATA ACTGGAAGTT CCTTGATTCA GAAAACAGAT TCAGATGAAG AAGTTGCAAT 60
GCTGTTGGAC ACAGTCCAGA AAGTATTTCA GAAAATGTTG GAATGTATTG CACGGAGCTT 120
CAGGAAGCAG CCGGAAGAAG GCCTGCGGCT GCTTTATTCT GTTCAGAGGC CTCTTCATGA 180
GTTCATTACT GCTGTTCAGT CTCGGCACAC AGACACCCCT GTGCACCGGG GTGTACTTTC 240
TACTCTGATC GCTGGGCCTG TGGTTGAGAT AAGTCACCAG CTACGGAAGG TTTCTGACGT 300
AGAAGAGCTT ACCCCTCCAG AGCATCTTTC TGATCTTCCA CCATTTTCAA GGTGTTTAAT 360
AGGAATAATA ATAAAGTCTT CGAATGTGGT CAGGTCATTT TTGGATGAAT TAAAGGCATG 420
TGTGGCTTCT AATGATATTG AAGGCATTGT GTGCCTCACG GCTGCTGTGC ATATTATCCT 480
GGTTATTAAT GCAGGTAAAC ATAAAAGCTC AAAAGTGAGG GAGGTTGCAG CCACTGTTCA 540
CAGAAAACTA AAGACATTCA TGGAAATTAC TTTGGAAGAG GATAGCATTG AAAGATTTCT 600
CTATGAATCA TCATCAAGAA CTCTGGGAGA ACTTTTGAAT TCATAACCAA GCCAACATCT 660
CCAGACATGT AAAAATAGGG AAAAGTGATT CAAATTGAAA TGCCTGTGTA TTTTCCTATT 720
GTTTTTAATG TTAATAACCC ATATAATAGG GAAAGGGTGG GATTTTTTTG TGGGAATGTG 780
GGAAGGTGGG GGTTATGGAG GAGATAACTC AAAACTTCTT CAATTTTGCC TAGTGCCTGC 840
GTAAATAATA TATTTAATAT AAAGGACTCC AGGTATGAAT GGTGTAGAAA TCCATGATTC 900
CAAGAAAAAA CACTTTTCTA GCAAACCTGG TTGTTTTTAA AATGACTTTT ATATATGTAA 960
TATTGCTTGG AAACTATGAG TAATAAAGCA ATGACAACAT CAAAAAAAAA A 1011

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2478 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGFET03 

(B) CLONE: 1255027

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
CCCACGCGTC CGCCCACGCG TCCGCAGCGC TGTGTTTGCG AGCGGGAGCG AGGGGCGCCG 60
GCTGGGGTGT GTGCTCCTGA GCTCTTCAGA AACCAGGCTG CTTTCAGGAA CATTGCTGTG 120
GATTCCCAGC TTTCAGACAA CACATGACTA AGACAGATGA GACCACTCTA GTTGCCTCAT 180
GGGAAACTCG GGAAAAGACT GCAAAAACAA CATTGTTTCT CCCTTTGGAA TTCTGGAGTT 240
ATAAGGCAGA GGTCCCCCAT CTTCCCGAAC TGGCCTATTC CGCTAGAAGC AAGATGGCTG 300
AACTCAATAC TCATGTGAAT GTCAAGGAAA AGATCTATGC AGTTAGATCA GTTGTTCCCA 360
ACAAAAGCAA TAATGAAATA GTCCTGGTGC TCCAACAGTT TGATTTTAAT GTGGATAAAG 420
CCGTGCAAGC CTTTGTGGAT GGCAGTGCAA TTCAAGTTCT AAAAGAATGG AATATGACAG 480
GCAAAAAGAA GAACAATAAA AGAAAAAGAA GCAAGTCCAA GCAGCATCAA GGCAACAAAG 540
ATGCIAAAGA CAAGGTGGAG AGGCCTGAGG CAGGGCCCCT GCAGCCGCAG CCACCACAGA 600
TTCAAAACGG CCCCATGAAT GGCTGCGAGA AGGACAGCTC GTCCACAGAT TCTGCTAACG 660
AAAAACCAGC CCTTATCCCT CGTGAGAAAA AGATCTCGAT ACTTGAGGAA CCTTCAAAGG 720
CACTTCGTGG GGTCACAGAA GGCAACAGAC TACTGCAACA GAAACTATCC TTAGATGGGA 780
ACCCCAAACC TATACATGGA ACAACAGAGA GGTCAGATGG CCTACAGTGG TCAGCTGAGC 840
AGCCTTGTAA CCCAAGCAAG CCTAAGGCAA AAACATCTCC TGTTAAGTCC AATACCCCTG 900
CAGCTCATCT TGAAATAAAG CCAGATGAGT TGGCAAAGAA AAGAGGCCCA AATATTGAGA 960
AATCAGTGAA GGATTTGCAA CGCTGCACCG TTTCTCTAAC TAGATATCGC GTCATGATTA 1020
AGGAAGAAGT GGATAGTTCC GTGAAGAAGA TCAAAGCTGC CTTTGCTGAA TTACACAACT 1080
GCATCATTGA CAAAGAAGTT TCATTAATGG CAGAAATGGA TAAAGTTAAA GAAGAAGCCA 1140
TGGAAATCCT GACTGCTCGT CAGAAGAAAG CAGAAGAACT AAAGAGACTC ACTGACCTTG 1200
CCAGTCAGAT GGCAGAGATG CAGCTGGCCG AACTCAGGGC AGAAATTAAG CACTTTGTCA 1260
GCGAGCGTAA ATATGACGAG GAGCTCGGGA AAGCTGCCCG GTTTTCCTGT GACATCGAAC 1320
AGCTGAAGGC CCAAATCATG CTCTGCGGAG AAATTACACA TCCAAAGAAC AACTATTCCT 1380
CAAGAACTCC CTGCAGCTCC CTGCTGCCTC TGCTGAATGC GCACGCAGCA ACCTCTGGGA 1440
AACAGAGTAA CTTTTCCCGA AAATCATCCA CTCACAATAA GCCCTCTGAA GGCAAAGCGG 1500
CAAACCCCAA AATGGTGAGC AGTCTCCCCA GCACCGCCGA CCCCTCTCAC CAGACCATGC 1560
CGGCCAACAA GCAGAATGGA TCTTCTAACC AAAGACGGAG ATTTAATCCA CAGTATCATA 1620
ACAACAGGCT AAATGGGCCT GCCAAGTCGC AGGGCAGTGG GAATGAAGCC GAGCCACTGG 1680
GAAAGGGCAA CAGCCGCCAC GAACACAGAA GACAGCCGCA CAACGGCTTC CGGCCCAAAA 1740
ACAAAGGCGG TGCCAAAAAT CAAGAGGCTT CCTTGGGGAT GAAGACCCCC GAGGCCCCGG 1800
CCCATTCTGA AAAGCCCCGG CGAAGGCAGC ACGCTGCAGA CACCTCGGAG GCCAGGCCCT 1860
TCCGGGGTAG TGTCGGTAGG GTTTCACAGT GCAATCTCTG CCCCACGAGA ATAGAAGTTT 1920
CCACAGATGC AGCAGTTCTC TCAGTCCCGG CTGTGACGTT GGTGGCCTGA GCTAGGAGGA 1980
AAAAGAGCAG TTTTCACTCA GTTTTGGTTC CCTGCCCGAG GTGCTGACCC AATTCGCTGC 2040
CAAAAGAGTG TCAATCAGAA TATACAAATC CCGTATGGTT GTGTCATCCT CTCTTAATCA 2100
TTTTTACTAA TTCTAATAAT CAGCTCTAGC TTGCTTCATA ATTTTCATGG CTTTGCTTGA 2160
TCTGTTGATG CTTTCTCTCA TCAAGACTTT GCAGCATTTT AGCCAGGCAG TATTTACTCA 2220
TTATTAGGAA AATCAAGATG TGGCTGAAGA TCAGAGGCTC AGTTAGCAAC CTGTGTTGTA 2280
GCAGTGATGT CAGTCCATTG ATTGTCTTTA GAGAGTTAAT GTTACAAAAA AGAATTCTTA 2340
ATAATCAGAC AAACATGATC TGCTGAGGAC ACATGCGCTT TTGTAGAATT TAACATCTGG 2400
TGTTTTTCTG AAAAAATATA TATACATATA TTGCTTTATT TGAAACAAAT TAAAATATGC 2460
TGCATTTGAA AAAAAAAA 2478



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1897 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: TESTTUT02 

(B) CLONE: 1273453 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85 : 
TGCACATCTA GCACAAATTG AAGATGATAG AGCTGCGATG GTTATTTCTT GGCATCTGGC 60
AAGTGACATG GACTGTGTAG TCACCCTAAC CACTGACGCT GCACGTCGTA TCTATGATGA 120
AACCCAAGGT CGTCAGCAGG TGTTGCCCCT TGATTCTATT TACAAGAAGA CTCTTCCAGA 180
TTGGAAAAGA TCTCTACCTC ATTTCCGAAA TGGAAAATTG TATTTTAAAC CCATTGGAGA 240
TCCAGTCTTT GCTCGAGACT TGTTAACATT TCCAGATAAT GTAGAACATT GTGAAACAGT 300
ATTTGGTATG CTGTTAGGAG ACACCATTAT TTTGGATAAT CTGGATGCGG CCAATCATTA 360
TAGAAAAGAG GTTGTTAAAA TTACACACTG TCCTACACTG CTGACCAGAG ATGGAGATCG 420
AATTCGAAGT AATGGAAAGT TTGGGGGCCT TCAGAATAAA GCTCCTCCAA TGGATAAACT 480
TCGGGGAATG GTATTTGGAG CTCCAGTTCC AAAACAGTGT CTGATCTTAG GGGAACAAAT 540
AGATCTTCTT CAGCAGTATC GTTCTGCTGT GTGCAAACTA GACAGTGTGA ATAAGGATCT 600
TAACAGTCAA TTAGAGTACC TTCGCACTCC GGATATGAGG AAGAAAAAGC AAGAACTTGA 660
TGAACATGAG AAAAATCTCA AACTAATAGA GGAAAAACTA GGTATGACTC CCATACGTAA 720
GTGTAATGAC TCATTGCGTC ATTCACCAAA GGTTGAGACG ACAGATTGTC CAGTTCCTCC 780
TAAAAGAATG AGACGAGAAG CTACAAGACA AAATAGGATT ATAACCAAAA CAGATGTATG 840
AGAGGTGACA GAGAGAAGAG GCCATTGGTC TCAGTAAGAA TGCCCTGCTT TCTGCATCTC 900
TGTTTCAGAA GACCAAGAGG GTGACTTACC AGACTGAGTA TTTCTGGGGA CAATACAAGT 960
ACCTGGGCAT GAATTTCCAT TTCGATTCAG ATGGGACTGG AAACAACCAT TCAATTTTAT 1020
GAATCTTACT GGACATTATG GATTTACTGG AATTATTCCA GACATTATGC CCTTTGGTTG 1080
TCACTACCTT GCAAATGTGT AAGAGGAAAA TGTGCTAATG TGGCAGTGAC TGTAAAACTG 1140
GCACATGGCA TTTATTAATC CTGAAGAAAA GTACATGTAC TATTTTTCAG TATAAATATA 1200
ATGAACATGT CAGAACTATT TCTTGAAAAC CTTTTTATTA CTTTTGCGTG AATTTATTTA 1260
ACAAAGATGT TTTGTCTTTT GTGTAAGGGA GGTTCTAGAG GCTAGATGTT TAATTGTAAA 1320
TATGTGAGGA AACTCAATGC AGAATTCAGG ATAAAAATTT TAAAAGCACA GGTATTTGGG 1380
AATTGAAATG TTAAGATACC CAGAACAACA TTAAATCAAT GAGTGAACTT GTGACAGTGG 1440
TAGCATTTCA AATTTCAAAA GACTTATCCT GTGTGTGTGT GTGTGTGTGT ATATATATAT 1500
ATATATATAT AAATATATAT ATATAAAATA TTCAGCAGCA CCAAGTTTTA TAACTATTGT 1560
TTGTTTGACT TTATTAATAC TAGAATATGT AGTCTCAGCC TTAATTTTAC ATTTACATTA 1620
TTTTGTAATT TTTTATTACT ATTTTTAAGG GGTTAAAGAG AACATACATT CTCACATTAG 1680
TGTACTTTCT GGTAGAAAGT TGCTGCAAAA ACATTTGAAA TGTATATTAA CCTAATGTAT 1740
GTCATATATA TGTCTTTGTG TAAGTTCAAG ACTATTGATC TGTGAAGTTA TTTTGTAAGG 1800
ACATACATTT GGTAAGTAAG TTTGTGTCCC AGGAAATGTA TGTGTTTTTA AACCCTTTCT 1860
AAATATGCAG GCCATTAATA AATAAGATTG TGTCTCA 1897



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: TESTTUT02 

(B) CLONE: 1275261 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
CCCACGCGTC CGGGGACATC CTGTTCTGAG TCAAGATTCC TCCTTCTGAA CATGGGACTT 60
TCCAGAAGGA CCACAGCTCC TCCCGTGCAT CCACTCGGCC TGGGAGGTTC TGGATTTTGG 120 
CTGTCGAGGG AGTTTGCCTG CCTCTCCAGA GAAAGATGGT CATGAGGCCC CTGTGGAGTC 180 
TGCTTCTCTG GGAAGCCCTA CTTCCCATTA CAGTTACTGG TGCCCAAGTG CTGAGCAAAG 240
TCGGGGGCTC GGTGCTGCTG GTGGCAGCGC GTCCCCCTGG CTTCCAAGTC CGTGAGGCTA 300 
TCTGGCGATC TCTCTGGCCT TCAGAAGAGC TCCTGGCCAC GTTTTTCCGA GGCTCCCTGG 360 
AGACTCTGTA CCATTCCCGC TTCCTGGGCC GAGCCCAGCT ACACAGCAAC CTCAGCCTGG 420
AGCTCGGGCC GCTGGAGTCT GGAGACAGCG GCAACTTCTC CGTGTTGATG GTGGACACAA 480
GGGGCCAGCC CTGGACCCAG ACCCTCCAGC TCAAGGTGTA CGATGCAGTG CCCAGGCCCG 540
TGGTACAAGT GTTCATTGCT GTAGAAAGGG ATGCTCAGCC CTCCAAGACC TGCCAGGTTT 600 
TCTTGTCCTG TTGGGCCCCC AACATCAGCG AAATAACCTA TAGCTGGCGA CGGGAGACAA 660 
CCATGGACTT TGGTATGGAA CCACACAGCC TCTTCACAGA CGGACAGGTG CTGAGCATTT 720 
CCCTGGGACC AGGAGACAGA GATGTGGCCT ATTCCTGCAT TGTCTCCAAC CCTGTCAGCT 780 
GGGACTTGGC CACAGTCACG CCCTGGGATA GCTGTCATCA TGAGGCAGCA CCAGGGAAGG 840 
CCTCCTACAA AGATGTGCTG CTGGTGGTGG TGCCTGTCTC GCTGCTCCTG ATGCTGGTTA 900 
CTCTCTTCTC TGCCTGGCAC TGGTGCCCCT GCTCAGGGAA AAAGAAAAAG GATGTCCATG 960 
CTGACAGAGT GGGTCCAGAG ACAGAGAACC CCCTTGTGCA GGATCTGCCA TAAAGGACAA 1020
TATGAACTGA TGCCTGGACT ATCAGTAACC CCACTGCACA GGCACACGAT GCTCTGGGAC 1080
ATAACTGGTG CCTGGAAATC ACCATGGTCC TCATATCTCC CATGGGAATC CTGTCCTGCC 1140
TCGAAGGAGC AGCCTGGGCA GCCATCACAC CACGAGGACA GGAAGCACCA GCACGTTTCA 1200 
CACCTCCCCC TTCCCTCTCC CATCTTCTCA TATCCTGGCT CTTCTCTGGG CAAGATGAGC 1260
CAAGCAGAAC ATTCCATCCA GGACACTGGA AGTTCTCCAG GATCCAGATC CATGGGGACA 1320
TTAATAGTCC AAGGCATTCC CTCCCCCACC ACTATTCATA AAGTACTAAC CAACTGGCAC 1380
CAAGAAAAAA TCCTCACTAA CCGCATCATC CGACAACTAA TAATTCACAC TACATCCAAA 1440
CATCACTTAG GCGGCGGGGC CGCCGACTGG TTCCGGGCTT AGGGTGGG 1488

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1357 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: COLNNOT16

(B) CLONE: 1281682 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
CCGACTTTGT AGCATTTTTA TTTAAGCTAA AACAGAGCAC ATGTATATGT ACATAAGACA 60
CATTAAATCT ATAAATACTA TTTATTCATT TTATATAAAC TAATGTAATG GAAAACAAAT 120
TCTTATGACT TTGTGGTTTT ATAGATGTTC TAGAAACTTT GTATGTAGGT ATCTACAAAA 180
TTAGTTCATT CCCCTGAATA TTTTTGCATT CATATTTTTG AGGTCTTGAT GTTTTCAGCC 240
TCTGGCGAAT CTTTTTCATT GAATTTGAAC CATTTGTAAA ATCTGTGATG CTGAAGCAGA 300
GTGTGTCACA AAGTGATGAG AACATTACTA AAATCCACGG ACGCACTGCG ACCTAAGGGC 360
TCAACGGCTG ACTCGGCAGC GGGCAGCCAC CCCACGCTCC CCTGCGGTCA CTCGCACACC 420
ACAGCCTGAA GCTCCCCCAG CGCCTGCACC TCGCACACAG CTAAGGTCAA AGTTCAAACG 480
CACTCCACAC GGAAGCTCAT TCTATACCCG AAGAGCAGTC TCAGAAAGCA AGATTACTTT 540
TGTGTTTTTT AAAAAATGAT TCTTTAATGT ATTTTTCTAA ACATTCTGAT TGGAAGTAGT 600
GGATTCCTAA ATGATTCCAA AGTCATCTGT AATTCTTCTG TTTTTGTTTT GTTCTGTCTT 660
TTCTTCATTT TGGCTTTGGG TGGGGGGAGG GGCAGGTGAC ACAAAGGATT TTTTTTTTTT 720
TTTTTTTTTA ATTTTTGGAA TCTTTTCCAA TAACCAGCTA AAGATTTGCA CTGAAATACA 780
ACTTGTATGC CTTTTGCATT TTTAAAGCCT GCTTCCTGGA TTTAAGCAGA GTGATAGTGT 840
TCAAAGAGCC AGTTCAGCCT GTAACATATT TGAAAAAGAT ATGTCTGCAC TTTGAGGTCC 900
CTTTTGAATG CCATTCACTA GACCTCTCAA GCATTTTGTT TCATTGCTAC ATCCAAGCGC 960
CTCACAAGTC CACAATGCGG GACAGCATCA AAAGCTCAAG ACTTTGGAAA AAGCTTGTGG 1020
GCTTGCACTG GGGGAGGGAA GGGAACAAAA TTTGTGTACT TCTTTGTTTA ATTTAGAAAT 1080
AAGGCATCCA AGAGATGCCA TTATTTTCTG TGTTTCAATT GTTGTGCCTT TGAGTTAAAC 1140
TGCATTTTTG TCTTTTGGTT GAAATCTGAA ATGTACTGTC CCAATATAAA ACAGTAATTA 1200
TTTGACCTTT GCACTGTTTG TCTGGTCCTT TTCAGTTTGA TTGCATATAA ATGTGGAACT 1260
TGATAGATCT CTATATTTTT AATGCACTTG TGATAAACTG GCAGCAGGGT TAGACATTAC 1320
TTTCAAAGCT TGAGGTAGAC CGAGTCAGCA TGCTAGA 1357

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2330 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTNOT07 

(B) CLONE: 1298305 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88 : 
CCTACTTGTT CCCACCTTGG GAGAGGACGA TGACTTGGGA GGGACGCGTG AAGGGAGAAG 60 
GGGTCCTCCC ATGAGGCTGA GGATGGCCTG AACCTGGAGC AGCGGACCAG GCAGACGGGC 120 
TGAAGTGGGG TCCCAAATTC CATGTCCAGA GGTGTGGGGA GCCTGCCTCC CTAGCTCCTG 180
GCCCCTGCCA GGGGCTTACA TCAAAACACC TCAGAGGGCT GCCCTCCAGA GGCTGCACCC 240
AGAACAGTGG GACATGAGCA GGGGTGTGGG CTTGGAGGGT GAAGAGGATG TGGTCCTATC 300
AGATGCTGGG CCTCCTCAGC CATAGCCCCC TGCTCCTACC CCCTGACTGG CTCTTGTGTC 360
CTCACCTCTC ACCCTCTCCT TCCTGGGAGG CCCTGGGAGG TGATCATTGA CACCCAGCCA 420
AGCAGACAGC TGCGGGTGCC CAAGCCCTTG CTGGGCCTGC GCGTGAGGAG TCCCACTGCT 480
TCTAAAGGAA GTCCTGGGCA GGAGGTGGCT TTGGTGGTTG GTTCCAAAGT TGAAAATGCT 540
TGCAGTTTGA CCTTAGAAGA AGTGGGAAGA AGAAGGAGCT CTACAGGGTC AGCTTTGTTT 600
GATTTGTCCA GTCTAAGAAG TCCCATTGCC AAAGCTTTCT GCAGGAGGGT GAATGCCGCA 660
GCTTGGCAGC CCCTGGGTTT CTCTTGGAAA TGGTCAGTTT CCCCTCAAAG TACCCAAAGT 720
AGCCTTGGCT TGAGTTTTTG TCCTTGCCTC CTTTTTAGAG AAGAGGGCAT TTAGACTGCA 780
TTTTCCTGGT TAAAGAAGGT TAAAGCAAAT GTTTATTGCC TTTTCTAGTG AACTAACTCG 840
TAGAGATGTT CTCAGCAGGA AGACAGTCTT AGCACTGTCA CTTAGCAGAT TGCACTTAAG 900
TCCCTTGTGC TGGCCAGATG GCGTGGCTGG TTGCCTTAAT ATGTCCCAGG ACCCCTGACA 960
GGGCTGCCTG GCCTCTCCCT CGTGCTCCTC AAGAGCCCAG TCCATACACT GTGGATGTCA 1020
TTGCTGTCGG GTTAGGAAGT CTTGTCCTAG AACGCCCTGG CTGGTATGAC CACAGTTCAT 1080
GGCGGCTCTT CTCGCTTGGG TCATGGTCAT CTTCCAGCAC CTGCTGTGCT GGGAAGGCCG 1140
AGGATGGGGG CCCAGCACTG TCCAGGCCTG CTGGGGCCTG GCTGGGAGTC CTGTGGGCAG 1200
CATGGAACAT GCAGCTGGGC TTCCTGTGAC CAGGCACCCT CTGGCACTGT TGCTTGCCCT 1260
GTGCCCTGGA CCTTTTCCTG CCCTTCTCCT TCCTCTGCTC CCTTGGGGCT ACCCCTTGGC 1320
CCCTCCTGGT CTGTGCAAAC TCCCTCAGGG AGCCCCCCTG CCCTGTAGCT CTCACTTAAC 1380
TTCCTAGGGG CTGCTGAGCC CACCCAGAGG TTGTTGGAGT TCAGCGGGGC AGCTTGTCTC 1440
CCTTGTCAGC AGGGGCGTAA GGGCTGGGTT TGGCCATACA AGGTTGGCTA CGCCCTCAAT 1500
CCCTGACCGT TCCAGGCACT GAGCTGGGCA CCCACGGAAG GACATGCTGT CCAGACTGTG 1560
ATGACTGCCA GCACAGGGCA TCTCGGGCTT GGCTGGTCTG CGAGGCCTTG CCCCTGTGGA 1620
ACTCTGGGTT CCTGTTTTCT CAGTCTTTTT GCGGCTTTGC TGTGGTTGGC AGCTGCCGTA 1680
CTCCAGGCTT GTGTCGGCCA CTCAGATGAG GGCTGTGGTG CGAGCCAGTG CAGGAGAGCT 1740
GCGCTTGGGA TTGTGCCCTC TCCTGTGTCT GTCCTCCGGA CCTACCCAGG TCTCCACCAT 1800
CAGGACCCTG TCTTTGGGTT TAGAAGACCA AGTATGGGGA AAACCAGACA CCAGCCTCTG 1860
CAGCAATGGG TCCCTCTAGC CTGTGGACAC CAGCTGGGGG ATCCAGGGTC AGGCCCCCTC 1920
CTCTCCCCAG TTTCCCTCTG CTGTGGGTTC TGGGCTGTCA TGTCTCCACC ACTTAAGGAT 1980
GTCTTTACAC TGACTTCAGG ATAGATGCTG GGATGCCTGG GCATGGCCAC ATGTTACATG 2040
TACAGAACTT TGTCTACAGC ACAAATTAAG TTATATAAAC ACAGTGACTG GTATTTAATG 2100
CTGATCTACT ATAAGGTATT CTATATTTAT ATGACTTCAG AGACGCGTAT GTAATAAAGG 2160
ACGCCCTCCC TCCAGTGTCC ACATCCAGTT CACCCCAGAG GGTCGGGCAG GTTGACATAT 2220
TTATTTTTGT CTATTCTGTA GGCTTCCATG TCCAGAATCC TGCTTAAGGT TTTAGGGTAC 2280
CTTCAGTACT TTTTGCAATA AAAGTATTTC CTATCCAAAA AAAAAAAAAA 2330



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT12 

(B) CLONE: 1360501

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89 : 
CTACACCTTT TCCATTTGCT AATAAGGCCC TGCCAGGCTG GGAGGGAATT GTCCCTGCCT 60 
GCTTCTGGAG AAAGAAGATA TTGACACCAT CTACGGGCAC CATGGAACTG CTTCAAGTGA 120
CCATTCTTTT TCTTCTGCCC AGTATTTGCA GCAGTAACAG CACAGGTGTT TTAGAGGCAG 180
CTAATAATTC ACTTGTTGTT ACTACAACAA AACCATCTAT AACAACACCA AACACAGAAT 240
CATTACAGAA AAATGTTGTC ACACCAACAA CTGGAACAAC TCCTAAAGGA ACAATCACCA 300
ATGAATTACT TAAAATGTCT CTGATGTCAA CAGCTACTTT TTTAACAAGT AAAGATGAAG 360
GATTGAAAGC CACAACCACT GATGTCAGGA AGAATGACTC CATCATTTCA AACGTAACAG 420
TAACAAGTGT TACACTTCCA AATGCTGTTT CAACATTACA AAGTTCCAAA CCCAAGACTG 480
AAACTCAGAG TTCAATTAAA ACAACAGAAA TACCAGGTAG TGTTCTACAA CCAGATGCAT 540
CACCTTCTAA AACTGGTACA TTAACCTCAA TACCAGTTAC AATTCCAGAA AACACCTCAC 600
AGTCTCAAGT AATAGGCACT GAGGGTGGAA AAAATGCAAG CACTTCAGCA ACCAGCCGGT 660
CTTATTCCAG TATTATTTTG CCGGTGGTTA TTGCTTTGAT TGTAATAACA CTTTCAGTAT 720
TTGTTCTGGT GGGTTTGTAC CGAATGTGCT GGAAGGCAGA TCCGGGCACA CCAGAAAATG 780
GAAATGATCA ACCTCAGTCT GATAAAGAGA GCGTGAAGCT TCTTACCGTT AAGACAATTT 840
CTCATGAGTC TGGTGAGCAC TCTGCACAAG GAAAAACCAA GAACTGACAG CTTGAGGAAT 900
TCTCTCCACA CCTAGGCAAT AATTACGCTT AATCTTCAGC TTCTATGCAC CAAGCGTGGA 960
AAAGGAGAAA GTCCTGCAGA ATCAATCCCG ACTTCCATAC CTGCTGCTGG ACTGTACCAG 1020
ACGTCTGTCC CAGTAAAGTG ATGTCCAGCT GACATGCAAT AATTTGATGG AATCAAAAAG 1080
AACCCCGGGG CTCTCCTGTT CTCTCACATT TAAAAATTCC ATTACTCCAT TTACAGGAGC 1140
GTTCCTAGGA AAAGGAATTT TAGGAGGAGA ATTTGTGAGC AGTGAATCTG ACAGCCCAGG 1200
AGGTGGGCTC GCTGATAGGC ATGACTTTCC TTAATGTTTA AAGTTTTCCG GGCCAAGAAT 1260
TTTTATCCAT GAAGACTTTC CTACTTTTCT CGGTGTTCTT ATATTACCTA CTGTTAGTAT 1320
TTATTGTTTA CCACTATGTT AATGCAGGGA AAAGTTGCAC GTGTATTATT AAATATTAGG 1380
TAGAAATCAT ACCATGCTAC TTTGTACATA TAAGTATTTT ATTCCTGCTT TCGTGTTACT 1440
TTTAATAAAT AACTACTGTA CTCAATACTC TAAAAATACT ATAACATGAC TGTGAAAATG 1500
GCAATGTTAT TGTCTTCCTA TAATTATGAA TATTTTTGGA TGGATTATTA GAATACATGA 1560
ACTCACTAAT GAAAGGCATT TGTAATAAGT CAGAAAGGGA CATAGGATTC ACATATCAGA 1620
CTGTTAGGGG GAGAGTAATT TATCAGTTCT TTGGTCTTTC TATTTGTCAT TCATACTATG 1680
TGATGAAGAT GTAAGTGCAA GGGCATTTAT AACACTATAC TGCATTCATT AAGATAATAG 1740
GATCATGATT TTTCATTAAC TCATTTGATT GATATTATCT CCATGCATTT TTTATTTCTT 1800
TTAGAAATGT AATTATTTGT TCTAGCAATC ATTGCTAACC TCTAGTTTGT AGAAAATCAA 1860
CACTTTATAA ATACATAATT ATGATATTAT TTTTCATTGT ATCACTGTTC TAAAAATACC 1920
ATATGATTAT AGCTGCCACT CCATCAGGAG CAAATTCTTC TGTTAAAAGC TAACTGATCA 1980
ACCTTGACCA CTTTTTTGAC ATGTGAGATC AAAGTGTCAA GTTGGCTGAG GTTTTTTGGA 2040
AAGCTTTAGA ACTAATAAGC TGCTGGTGGC AGCTTTGTAA CGTATGATTA TCTAAGCTGA 2100
TTTTGATGCT AAATTATCTT AGTGATCTAA GGGGCAGTTT AGTGAAGATG GAATCTTGTA 2160
TTTAAAATAG CCTTTTAAAA TTTGTTTTGT GGTGATGTAT TTTGACAACT TCCATCTTTA 2220
GGAGTTATAT AATCACCTTG ATTTTAGTTT CCTGATGTTT GGACTATTTA TAATCAAGGA 2280
CACCAAGCAA GCATAAGCAT ATCTATATTT CTGACTGGTG TCTCTTTGAG AAGGATGGGA 2340
AGTAGAAAAA AAAAAAAGAA AGAAAGGAAA GGAAGAGAGG AGAGAAGAAG GCAGGGATCT 2400
CCACTATGTA TGTTTTCACT TTAGAACTGT TGAGCCCATG CTTAATTTTA ATCTAGAAGT 2460
CTTTAAATGG TGAGACAGTG ACTGGAGCAT GCCAATCAGA GAGCATTTGT CTTCAGAAAA 2520
AAAAAAAATC TGAGTTTGAG ACTAGCCTGG CCAACATGTT GAAACCCCAT ATCTACTAAA 2580
AATACAAAAA TTAGCCTGGT GTGGTGGCGC ACGCCTGTAG TCCCAGCTAC TCTGGAGCCT 2640
GAGGAACGTG AATCGCTTGA ACCCAGAAGA CAGAGGTTGC AGTGAGCTGA GATGGCACTA 2700
TTGCACTCCA GACTGGTGAC ACACGCAGA 2729



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1386 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGNOT12 

(B) CLONE: 1362406 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90 : 

GGCCCCTGCA CTGCTCCTGA TCCCTGCTGC CCTCGCCTCT TTCATCCTGG CCTTTGGCAC 60
CGGAGTGGAG TTCGTGCGCT TTACCTCCCT TCGGCCACTT CTTGGAGGGA TCCCGGAGTC 120
TGGTGGTCCG GATGCCCGCC AGGGATGGCT GGCTGCCCTG CAGACCGCAG CATCCTTGCC 180
CCCCTGGCAT GGGATCTGGG GCTCCTGCTT CTATTTGTTG GGCAGCACAG CCTCATGGCA 240
GCTGAAAGAG TGAAGGCATG GACATCCCGG TACTTTGGGG TCCTTCAGAG GTCACTGTAT 300
GTGGCCTGCA CTGCCCTGGC CTTGCAGCTG GTGATGCGGT ACTGGGAGCC CATACCCAAA 360
GGCCCTGTGT TGTGGGAGGC TCGGGCTGAG CCATGGGCCA CCTGGGTGCC GCTCCTCTGC 420
TTTGTGCTCC ATGTCATCTC CTGGCTCCTC ATCTTTAGCA TCCTTCTCGT CTTTGACTAT 480
GCTGAGCTCA TGGGCCTCAA ACAGGTATAC TACCATGTGC TGGGGCTGGG CGAGCCTCTG 540
GCCCTGAAGT CTCCCCGGGC TCTCAGACTC TTCTCCCACC TGCGCCACCC AGTGTGTGTG 600
GAGCTGCTGA CAGTGCTGTG GGTGGTGCCT ACCCTGGGCA CGGACCGTCT CCTCCTTGCT 660
TTCCTCCTTA CCCTCTACCT GGGCCTGGCT CACGGGCTTG ATCAGCAAGA CCTCCGCTAC 720
CTCCGGGCCC AGCTACAAAG AAAACTCCAC CTGCTCTCTC GGCCCCAGGA TGGGGAGGCA 780
GAGTGAGGAG CTCACTCTGG TTACAAGCCC TGTTCTTCCT CTCCCACTGA ATTCTAAATC 840
CTTAACATCC AGGCCCTGGC TGCTTCATGC CAGAGGCCCA AATCCATGGA CTGAAGGAGA 900
TGCCCCTTCT ACTACTTGAG ACTTTATTCT CTGGGTCCAG CTCCATACCC TAAATTCTGA 960
GTTTCAGCCA CTGAACTCCA AGGTCCACTT CTCACCAGCA AGGAAGAGTG GGGTATGGAA 1020
GTCATCTGTC CCTTCACTGT TTAGAGCATG ACACTCTCCC CCTCAACAGC CTCCTGAGAA 1080
GGAAAGGATC TGCCCTGACC ACTCCCCTGG CACTGTTACT TGCCTCTGCG CCTCAGGGGT 1140
CCCCTTCTGC ACCGCTGGCT TCCACTCCAA GAAGGTGGAC CAGGGTCTGC AAGTTCAACG 1200
GTCATAGCTG TCCCTCCAGG CCCCAACCTT GCCTCACCAC TCCCGGCCCT AGTCTCTGCA 1260
CCTCCTTAGG CCCTGCCTCT GGGCTCAGAC CCCAACCTAG TCAAGGGGAT TCTCCTGCTC 1320
TTAACTCGAT GACTTGGGGC TCCCTGCTCT CCCGAGGAAG ATGCTCTGCA GGAAAATAAA 1380
AGTCAG 1386



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LATRTUT02 

(B) CLONE: 1405329 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91 : 
CCCGGGCCAT GCAGCCTCGG CCCCGCGGGC GCCCGCCGCG CACCCGAGGA GATGAGGCTC 60 
CGCAATGGCA CCTTCCTGAC GCTGCTGCTC TTCTGCCTGT GCGCCTTCCT CTCGCTGTCC 120 
TGGTACGCGG CACTCAGCGG CCAGAAAGGC GACGTTGTGG ACGTTTACCA GCGGGAGTTC 180
CTGGCGCTGC GCGATCGGTT GCACGCAGCT GAGCAGGAGA GCCTCAAGCG CTCCAAGGAG 240 
CTCAACCTGG TGCTGGACGA GATCAAGAGG GCCGTGTCAG AAAGGCAGGC GCTGCGAGAC 300 
GGAGACGGCA ATCGCACCTG GGGCCGCCTA ACAGAGGACC CCCGATTGAC GCCGTGGAAC 360
GGCTCACACC GGCACGTGCT GCACCTGCCC ACCGTCTTCC ATCACCTGCC ACACCTGCTG 420
GCCAAGGAGA GCAGTCTGCA GCCCGCGGTG CGCGTGGGCC AGGGCCGCAC CGGAGTGTCG 480
GTGGTGATGG GCATCCCGAG CGTGCGGCGC GAGGTGCACT CGTACCTGAC TGACACTCTG 540
CA 542 



(2) INFORMATION FOR SEQ ID NO: 92: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 772 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAINOT12 

(B) CLONE: 1415223

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92 : 
CGAGCCCGGA GTGCGGACAC CCCCGGGATG CTTGCGCCCC AGAGGACCCG CGCCCCAAGC 60 
CCCCGCGCCG CCCCCAGGCC CACCCGGAGC ATGCTGCCTG CAGCCATGAA GGGCCTCGGC 120
CTGGCGCTGC TGGCCGTCCT GCTGTGCTCG GCGCCCGCTC ATGGCCTGTG GTGCCAGGAC 180
TGCACCCTGA CCACCAACTC CAGCCATTGC ACCCCAAAGC AGTGCCAGCC GTCCGACACG 240
GTGTGTGCCA GTGTCCGAAT CACCGATCCC AGCAGCAGCA GGAAGGATCA CTCGGTGAAC 300
AAGATGTGTG CCTCCTCCTG TGACTTCGTT AAGCGACACT TTTTCTCAGA CTATCTGATG 360
GGGTTTATTA ACTCTGGGAT CTTAAAGGTC GACGTGGACT GCTGCGAGAA GGATTTGTGC 420
AATGGGGCGG CAGGGGCAGG GCACAGCCCC TGGGCCCTGG CCGGGGGGCT CCTGCTCAGC 480
CTGGGGCCTG CCCTCCTCTG GGCTGGGCCC TGATGTCTCC TGCTTCCCAC GGGGCTTCTG 540
AGCTTGCTCC CCTGAGCCTG TGGCTGCCCT CTCCCCAGCC TGGCGTGGCT GGGGCTGGGG 600
GCAGCCTTGG GCCAGCTCCG TGGCTGTGGC CTGTGGGTCT GAATTCTTCC CCGACGTGAA 660
GCCTNCCTGT CTCTCCGGCA GCTCTGAGTC CCAGGCAGCT GGACATTCCA GGGGAACAAG 720
CCATTNGGCA GGAGGGCTGG GATGAGGTTG GGGGGGACCG GAGGTCCCGG AG 772

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAINOT12 

(B) CLONE: 1416553 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93 : 
TGTCCATCCA AAAACCATAA AATCACTGGG TTCCACATCA GCCTCCATGA GGCCAAGCCT 60
TGTACCTGCA AGCTCTTGGC CTAACCATTC CTCTGTCCTC TTCTCTGGCC TGCCTGGGGA 120
GCCCGTGAAG GCCGCACGGG TGCCTCCAGC CTGAGACATC AGGGGAGAGC CTGCAGCTGA 180
GTTCAGCAGA AAGGAGGAAT CCTGGCCCTC AGGAAGAAGA TAGTCACATG TTTTTCTTCC 240
TTGTCCCCAC AGCCCCCAGA ACAACATTCT CCCTGCTGGC AGCCCTTCCA TGTCTCCAAA 300
CCTGGGTCAG AGTGAAAGGA CCTTTGGGGG TGGGTGGGAG CAAAGGGCCC ACCTGCTGGT 360
TGGTGAAAGC AGTGGTGCCG GAGTGCTAGG TACCGCACGA GTAGTGGTGC GGGGGCTTGG 420
GAAGCAGACC AGGGTTGGAC AAAACCCCAT GAGGGCGGGG AGCTGGAAGA AAAGTCTCTT 480
GGGGACCTCT GGGGCAAGGA GCTGAGAAGT CCTGCAGCAC CAGGTGAGAC TTGCTTACAG 540
TGGATGCCAC TTCTAGGCCT CTGGACCGCA GATGCCCTCC TCCCTCCTGC ACACCTGGCC 600
TCCTGGGCCT CCAGGTAAAG AGAGAGAGCC AGCCCAGCCC TGTTTCCCCT CAGTCCTCCT 660
TTGCTCCTGC TGCTTCTCCC AACAGCCCAC TGTTAGGAGG TAGTAGACCC CAGCCTCAAG 720
GCTCTGACCT TCTTCATGTG GGCACAGAGG GTCCTGACAC TCTGGCAGGG CCTGAGCTGG 780
GGCAGGCCTC CCTCAGGGCC AGGGGCGATG GCACCCCGGG GACAGGCAGA CCTCCTTCCT 840
GCCGTCAGCA CCCCCTTCCT TATCACTGTC TGGTCTCCGA GCTTCGGCTG CAGCCTGAGG 900
TGTGTCCTGG GCTCCTCAGA GCCTGAAGCA AGCTTTTGGA AGCCTGCAGT CCTCCCAGCT 960
CCAGTGCAGA AGCCTCTCTC TCCAGCCTTT CCCCAGGCAG GAGTTGGGGT TGGGGGCCTC 1020
TGTCCCTCAT CGCTTACCTT GGAAAGGTGG GAAGCTGGCA ATCTGCACCT TGGGGCCTGG 1080
GCTCCCCCTC TCTGTGCCAG CGGCTTCCCA GCACCTGGGA GGGGCTGCAG CCCCAGCTGG 1140
ACTCCAGCCT GTCCCTCTTA GCACTCTAGC TGCCCACTCC AGGGCAGGGA CTCGAAACCC 1200
CCTCCGTCCT GAGCAGCCAC CTCCAGGGCC CTGTTTGGGA CCACTCTCTC AGTCCCCAGG 1260
TCCTCAGGGC CCCAGAGCGG GAGGGTCTCC TACCTGGAAG TCCCCCTGAG CTCCAGGGCC 1320
CAGCCCTACC TGCCAGTGCT GGTGTCAGGG CACTCAACAC CGAGTGTGGG GGCCACGCCC 1380
CTTGCCATGC CCACGGCCTC CTCCTGTAGC CCCTGCCTGC ACCCACGATG CTGCACGGGC 1440
CCGCCCTGGT GGGGCTCGGC GAGTAATGTG TTTTGTCCCC AGTTAACCAC CATTCTGCGG 1500
CCTGGTTCTG CAAGGAACCA GGGCTGCCCC ACCGCCCGCC GTCTGCCGCC CTAGGCTTCC 1560
TGACTCCATT AGTTCCGACA CTTGTGAAAC TCCGAGAAGT GCTGTGGTCT CAGCAATGCA 1620
CCTGTTTTGT ACATGATTGT GTAATTTAAA GGTATATAAA TACAAATATA TATATATATC 1680
AGTTGTGATT GTATGACTGT GGATAAAATC CAGAACTGTG TCAACCTGAA AAAAAAAA 1738

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2100 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: KIDNNOT09 

(B) CLONE: 1418517 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94 : 
GGGAAAGCGG CGAGTAAGAT GGAAGATGAG GAGGTCGCTG AGAGCTGGGA AGAGGCGGCA 60 
GACAGCGGGG AAATAGACAG ACGGTTGGAA AAAAAACTGA AGATCACACA AAAAGAGAGC 120 
AGGAAATCCA AATCTCCTCC CAAAGTGCCC ATTGTGATTC AGGACGATAG CCTTCCCGCG 180
GGGCCCCCTC CACAGATCCG CATCCTCAAG AGGCCCACCA GCAACGGTGT GGTCAGCAGC 240
CCCAACTCCA CCAGCAGGCC CACCCTTCCA GTCAAGTCCC TAGCACAGCG AGAGGCCGAG 300
TACGCCGAGG CCCGGAAGCG GATCCTGGGC AGCGCCAGCC CCGAGGAGGA GCAGGAGAAA 360
CCCATCCTCG ACAGGCCAAC CAGGATCTCC CAACCCGAAG ACAGCAGGCA GCCCAATAAT 420
GTGATCAGAC AGCCTTTGGG TCCTGATGGG TCTCAAGGCT TCAAACAGCG CAGATAAATG 480
CAGGCAAGAA AAGATGCCGC CGTTGCTGCC GTCACCGCCT CCTGGGTCGT CCGCCACGGG 540
TTGCACTGCC GTGGCAGACA GCTGGACTTG AGCAGAGGGA ACGACCTGAC TTACTTGCAC 600
TGTGATCCCC CTTGCTCCGC CCACTGTGAC CTTGAACCCC ATGCACTGTG ACCTCCCCCC 660
TTCTCCCCCT TCCCACTGTG ATTGGCACAT CGACAAGGGC TGTCCCAAGT CAATGGAAAG 720
GGAAAGGGTG GGGGTTAGGG GAAGGTTGGG GGGACCCAGC AAGGACTCAG AGAGTCAGAC 780
AGTGCCACTT GGCCACTTGG GGTAAAGCCA GTGCCAGCAA TAACAGTTTA TCATGCTCAT 840
TAATTTGGGA TTTCAAAACA CAAATGAAAA CTCACACCCA CCCACCCCCA AGTGCATGTC 900
TCCATCACTT AAAAAGTAAG TTCCATTTGA AAATATCCTT TCTTTTTTTT TTCTTCCTAT 960
TTTTGTTTGT TTATACAAAT ATCTGATTTG CAAGAAAAAG TGCATGGGAG GGGTTTTAGT 1020
GGTTTAATGA ATTTTTAATT AAGAAAGGGT AGTTTGGTAG TCTACTTAAA AATGTTTCTG 1080
GGAAATTCAC TAGAAACATT AACCAATAGG ATTTTGGTGA GCTTAGCTTC TGTATTCCTA 1140
CTGCCGCCCA GAAAAGGGGC AGGGCTCTGC AGCCGCCAGG ACAGACGAGC ACCCCATGCC 1200
TATACCTCCC TCCCCGAGCT AAGTCCCAGG GCATCTGGGC CTTGCCTGGA GACTGGGCTA 1260
GCTCTGTAGG CTCGGAGAGC CTGGGGAGGG TGCCAACCCC ACCTCTAGTA TTTTGGGAGA 1320
TAGGGAAAGT GAACCGACTT CCCCTTCCCA TACCCCTCAG GGTGGTTCCC TACCAGCCAG 1380
GCTTACTACT TCTAGAAGAA AGCAGAGTGC CAGGGAGTGA GATTGCATCC CTGGGCTTAG 1440
AAGTGACGGA GAGAAGACTT GTTTAGTATT TTGCCATCAG CACAAGGAAA ACCAGGAGAG 1500
AGTCTGCCTC CAGGACTCTG AGCCTTCTGC CTCGTATGTT CAGAAGGTGG ATAGGTCTTC 1560
CCACTCCAGC ATGGCTTGAA CTCTTAGGGG TCTGCAGTGC TCCATCTCCA TTGGTGGCCC 1620
CAGCTCAGTA ACTATACCTG GTACATTTCC TGTGTGCAAT CAGTACCTTG AAGGCAGAAC 1680
ATTCTGAATA AAGTTGGAAA AAGAACAGCT TTGCTTTGCA AAGATTGATG ACAGACTGGT 1740
TCCTCAGAGG CCTAGGCTAC CCGTCACCCC TTTTTCCAGA GCGAGGGCCT GGAATGAAGG 1800
CAGTTTATCC TCTGTCCCTG GAGCCTGGGG TTTGCTTTGG CTCCTTGAGG TGGAAGAGAC 1860
TAAGAGGGCA GCTGCCCAGA GCAGCTGTGT GTACCTGGCT CCTCTCAGGC TTCCTGATCC 1920
CTTCCATTGC ACTGCGCCTT ATCCCTCAGC CAGCCAGACA GCCTCCCTGC TCCTGACCAG 1980
CAGATACGTT TCGGAGTGGT TGGTGTGGTT TTTGTGATGA GGGCAGCACA TGGTGGCCAA 2040
GGTGGGCAAA GCTGAGTCTC ACAAGGCTCA AATCCCTTCG GTTGGGNTCC CCTTGTGGGG 2100



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PANCNOT08

(B) CLONE: 1438165 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95 : 
GCGGGCGGAG ATGTAGACCC GGTAGTGTTG TGCCTTGTGG TGACAACTGG CGGCAGCGCG 60 
CCGCGGGCCC GAGACTTAGT CTCGGGCCGC CATGGCCAGC GTCCACGAGA GCCTCTACTT 120
CAATCCCATG ATGACCAATG GGGTTGTGCA CGCCAATGTG TTCGGCATCA AGGACTGGGT 180
GACGCCGTAC AAGATCGCGG TGCTGGTGCT GCTGAACGAG ATGAGCCGCA CAGGCGAGGG 240
CGCCGTCAGC CTCATGGAGC GGCGGAGGCT CAACCAGCTG CTCCTGCCCC TGCTGCAGGG 300
CCCAGATATT ACACTGTCAA AACTTTACAA GTTAATTGAA GAGTCTTGTC CACAGCTGGC 360
AAATTCAGTG CAGATCAGAA TCAAACTGAT GGCTGAAGGC GAGTTGAAGG ATATGGAACA 420
GTTTTTTGAT GACCTTTCAG ATTCTTTCTC TGGAACTGAA CCAGAGGTTC ACAAAACAAG 480
TGTAGTAGGT TTGTTTCTGC GTCACATGAT CTTGGCCTAC AGTAAGCTTT CTTTCAGCCA 540
AGTGTTTAAA CTGTACACTG CCCTTCAGCA GTACTTCCAG AATGGTGAGA AAAAGACAGT 600
GGAGGATGCT GATATGGAAC TGACCAGTAG AGATGAGGGT GAAAGAAAAA TGGAAAAAGA 660
AGAACTTGAT GTATCTGTAA GAGAAGAGGA GGTATCTTGC AGTGGGCCTC TGTCCCAAAA 720
ACAAGCAGAA TTTTTTCTTT CTCAACAGGC TTCTTTGCTA AAGAATGATG AGACTAAGGC 780
CCTCACTCCA GCTTCCTTGC AGAAGGAATT AAACAATTTG TTGAAATTTA ATCCTGATTT 840
TGCTGAAGCG CATTATCTCA GCTACTTAAA CAACCTCCGT GTCCAAGATG TTTTCAGTTC 900
AACACACAGT CTCCTCCATT ATTTTGATCG TCTGATTCTT ACCGGAGCCG AAAGCAAAAG 960
TAATGGGGAA GAGGGCTATG GCCGGAGCTT GAGATACGCC GCTCTGAATC TTGCCGCCCT 1020
GCACTGCCGC TTCGGTCACT ATCAACAGGC AGAGCTCGCC CTGCAGGAGG CAATTAGGAT 1080
TGCCCAGGAG TCCAACGATC ACGTGTGTCT CCAGCACTGT TTGAGCTGGC TTTATGTGCT 1140
GGGGCAGAAG AGATCCGATA GCTATGTTCT GCTGGAGCAT TCTGTGAAGA AGGCAGTACA 1200
TTTTGGGTTA CCGAGAGCTT TTGCTGGGAA GACGGCAAAC AAGCTGATGG ATGCCCTAAA 1260
GGACTCCGAC CTCCTGCACT GGAAACACAG CCTGTCAGAG CTCATCGATA TCAGCATCGC 1320
ACAGAAAACG GCCATCTGGA GGCTGTATGG CCGCAGCACC ATGGCACTGC AACAGGCCCA 1380
GATGTTGCTG AGCATGAACA GCCTGGAGGC GGTGAATGCG GGCGTGCAGC AGAACAACAC 1440
AGAGTCCTTT GCTGTCGCAC TCTGCCACCT CGCAGAGCTA CACGCGGAGC AGGGCTGTTT 1500
TGCTGCAGCT TCTGAAGTGT TAAAGCACTT GAAGGAACGA TTTCCGCCTA ATAGTCAGCA 1560
CGCCCAGTTA TGGATGCTAT GTGATCAAAA AATACAGTTT GACAGAGCAA TGAATGATGG 1620
CAAATATCAT TTGGCTGATT CACTTGTTAC AGGAATCACA GCTCTCAATA GCATAGAGGG 1680
TGTTTATAGG AAAGCGGTTG TATTACAAGC TCAGAACCAA ATGTCAGAGG CACATAAGCT 1740
TTTACAAAAA TTGTTGGTTC ATTGTCAGAA ACTGAAGAAC ACAGAAATGG TGATCAGTGT 1800
CCTACTGTCC GTGGCAGAGC TGTACTGGCG ATCTTCCTCC CCTACCATCG CGCTGCCCAT 1860
GCTCCTGCAG GCTCTGGCCC TCTCCAAGGA GTACCGGTTA CAGTACTTGG CCTCTGAAAC 1920
AGTGCTGAAC TTGGCTTTTG CGCAGCTCAT TCTTGGAATC CCAGAACAGG CCTTAAGTCT 1980
TCTCCACATG GCCATCGAGC CCATCTTGGC TGACGGGGCT ATCCTGGACA AAGGTCGTGC 2040
CATGTTCTTA GTGGCCAAGT GCCAGGTGGC TTCAGCAGCT TCCTACGATC AGCCGAAGAA 2100
AGCAGAAGCT CTGGAGGCTG CCATCGAGAA CCTCAATGAA GCCAAGAACT ATTTTGCAAA 2160
GGTTGACTGC AAAGAGCGCA TCAGGGACGT CGTTTACTTC CAGGCCAGAC TCTACCATAC 2220
CCTGGGGAAG ACCCAGGAGA GGAACCGGTG TGCGATGCTC TTCCGGCAGC TGCATCAGGA 2280
GCTGCCCTCT CATGGGGTAC CCTTGATAAA CCATCTCTAG AGAGGACATC CCTGCTGGGC 2340
TGCTGTGCAG AGTATAAGAT TTTGGACTTG TTCATGTCCC CTCTCTCCCT ATAAATGATG 2400
TATTTGTGAC ACCCTATCTT GTCAATAAAC AGCATTCTGA TTAAAAAAAA AAAAAAAA 2458



(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THYRNOT03 

(B) CLONE: 1440381 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96 : 
TGCATGGATG GGATACTGGA TGAATCTTTG CTTGAAACCT GTCCAATTCA GTCACCATTA 60 
CAAGTTTTTG CAGGAATGGG TGGACTGGCT CTTATTGCTG AAAGACTACC CATGCTATAT 120
CCAGAAGTAA TTCAACAGGT GAGTGCTCCA GTTGTAACAT CTACCACTCA GGAAAAGCCG 180
TATGATAGCG ATCAGTTTGA ATGGGTGACC ATTGAACAGT CAGGGGAGTT AGTTTATGAA 240
GCACCAGAAA CTGTTGCGGC TGAACCTCCA CCTATCAAGT CAGCAGTACA GACCATGTCT 300
CCCATACCTG CCCATTCTTT GGCTGCTTTT GGATTATTTC TTCGTCTTCC GGGCTATGCG 360
GAAGTGCTAC TGAAAGAGAG AAAACATGCC CAGTGCCTTC TTCGATTGGT ATTGGGAGTG 420
ACAGATGATG GAGAAGGAAG TCATATTCTT CAATCTCCAT CAGCCAATGT GCTTCCAACC 480
CTTCCTTTCC ACGTCCTTCG TAGCTTGTTT AGCACTACAC CTTTGACAAC TGATGATGGT 540
GTACTTCTAA GGCGGATGGC ATTGGAAATT GGAGCCTTAC ACCTCATTCT TGTCTGTCTC 600
TCTGCTTTGA GCCACCATTC CCCACGAGTT CCAAACTCTA GCGTGAATCA AACTGAGCCA 660
CAGGTGTCAA GCTCTCATAA CCCTACATCA ACAGAAGAAC AACAGTTATA TTGGGCCAAA 720
GGGACTGGCT TTGGAACAGG CTCTACAGCT TCTGGGTGGG ATGTGGAACA AGCCTTAACT 780
AAGCAAAGGC TGGAAGAGGA ACATGTTACC TGCCTTCTGC AGGTTCTTGC CAGTTACATA 840
AATCCCGTCA GTAGTGCGGT AAATGGAGAA GCTCAGTCAT CTCATGAGAC TAGAGGGCAG 900
AACAGTAATG CCCTTCCTTC TGTACTTCTC GAGCTTCTCA GTCAGTCCTG CCTCATCCCA 960
GCCATGTCAT CTTATCTACG AAATGATTCA GTTCTGGACA TGGCAAGACA TGTGCCACTC 1020
TATCGGGCAC TGCTGGAATT GCTTCGGGCC ATTGCTTCTT GTGCTGCCAT GGTGCCCCTA 1080
TTGTTGCCCC TTTCTACAGA GAACGGTGAA GAGGAAGAAG AACAGTCAGA ATGTCAAACT 1140
TCTGTTGGTA CATTGTTAGC CAAAATGAAG ACCTGTGTTG ATACCTATAC CAACCGTTTA 1200
AGATCTAAAA GGGAAAATGT TAAAACAGGA GTAAAACCAG ATGCGTCTGA TCAAGAACCA 1260
GAAGGACTTA CTCTTTTGGT ACCAGACATC CAAAAGACTG CTGAGATAGT TTATGCAGCC 1320
ACCACCAGTT TGCGGCAAGC AAATCAGGAA AAAAACTGGG TGAATACTCC AAGAAGGCGG 1380
CTAATGAACC CCAAACCTTT GTCAGTATTA AAGTCACTTG AAGAAAAATA TGTGGCTGTT 1440
ATGAAGAAAT TACAGTTTGA TACGTTTGAA ATGGTTTCTG AAGATGAAGA TGGGAAATTG 1500
GGATTTAAAG TAAATTACCA CTACATGTCT CAGGTGAAAA ATGCTAATGA TGCGAACAGT 1560
GCTGCCAGAG CTCGCCGCCT TGCCCAGGAA GCTGTGACGC TTTCAACCTC ACTGCCTCTG 1620
TCTTCATCCT CTAGTGTGTT TGTACGCTGT GATGAGGAGC GACTTGATAT CATGAAGGTT 1680
CTAATAACTG GTCCAGCGGA CACCCCTTAT GCAAATGGCT GCTTTGAGTT TGATGTGTAT 1740
TTTCCTCAAG ATTATCCCAG TTCACCCCCT CTTGTGAATC TAGAGACAAC TGGTGGTCAT 1800
AGCGTGCGAT TCAATCCAAA CCTTTATAAT GATGGCAAGG TTTGTTTAAG CATCTTAAAC 1860
ACGTGGCATG GAAGACCAGA AGAGAAGTGG AATCCTCAGA CCTCAAGCTT TTTGCAAGTG 1920
TTGGTGTCTG TCCAGTCCCT TATATTAGTA GCTGAGCCTT ATTTTAATGA ACCGGGATAT 1980
GAACGGTCTA GAGGCACTCC CAGTGGCACA CAGAGTTCTC GAGAATATGA TGGAAACATT 2040
CGACAAGCAA CAGTTAAGTG GGCAATGCTA GAACAAATCA GAAACCCTTC ACCATGTTTT 2100
AAAGAGGTAA TACACAAACA TTTTTACTTG AAAAGAGTTG AGATAATGGC CCAATGTGAG 2160
GAGTGGATTG CGGATATCCA GCAGTACAGC AGTGATAAGC GGGTAGGCAG GACTATGTCT 2220
CACCATGCAG CAGCTCTCAA GCGTCACACT GCTCAGCTCC GCGAAGAGTT GCTGAAACTT 2280
CCCTGCCCTG AAGGCTTGGA TCCTGACACT GACGATGCCC CAGAGGTGTG CAGAGCCACA 2340
ACAGGTGCTG AGGAGACTCT AATGCATGAT CAGGTTAAAC CCAGCAGCAG CAAAGAACTC 2400
CCCAGTGACT TCCAGTTATG AGCTGCATTG ATGTGGACTT CATAGACACA AAGGCTTCGA 2460
AGCACAAGCC AAATATGTCA ATATTTGTAT GTAAGAAACT AATTATGTAA TAGGTAATGA 2520
AACTGAAACT ATACTATGCC CTTAAGGAGA TCCAGTTTAA TTCAAGGTGA TCTTTTATTT 2580
ACCTGTACAG GAGTGTAAAC TTTTTTGTGC TTTTATTTTT CAATTGTGAG AACCACTGAT 2640
TGGTATGTTC AACAAATTTG TGTATACAAA GAAATGGATA AATCACTGCT ATATAAGGGA 2700
AACTACCTTA GGAAAGAATG TTTACTGAAT GTTTATTTTA TTTTATTTTT TTTTTACTAT 2760
AGAGTGAGGG GTTGTTAACA AAGAATATAT ATTGGTCGTT CTTACAACTA CTATTTAAAG 2820
TCAGCAACTT TTCACTGAAT TTGATAGATT TTATGTTTGG GGGTACGAGC TTGTAAAGCT 2880
CGGGTGCCTN ATGAGTGACC 2900



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1310 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: LUNGNOT14

(B) CLONE: 1510839 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97 : 
CCGCTGAGAT GTACGAACTT CCGGTTCTCC GGGCAGCTGC CACTGCTGTA GCTTCTGCCA 60 
CCTGCCACGA CCGGGCCTCT CCCTGGCGTT TGGTCACCTC TGCTTCATTC TCCACCGCGC 120 
CTATGGTCCC TCTTGGAGCC AGCGTGGCGG GCCTGGCGGC TCCCGGGTGG TGAGAGAGCG 180 
GTCCGGGAAC GATGAAGGCC TCGCAGTGCT GCTGCTGTCT CAGCCACCTC TTGGCTTCCG 24 0 
TCCTCCTCCT GCTGTTGCTG CCTGAACTAA GCGGGCCCCT GGCAGTCCTG CTGCAGGCAG 300 
CCGAGGCCGC GCCAGGTCTT GGGCCTCCTG ACCCTAGACC ACGGACATTA CCGCCGCTGC 3 60 
CACCGGGCCC TACCCCTGCC CAGCAGCCGG GCCGTGGTCT GGCTGAAGCT GCGGGGCCGC 4 20 
GGGGCTCCGA GGGAGGCAAT GGCAGCAACC CTGTGGCCGG GCTTGAGACG GACGATCACG 4 80 
GAGGGAAGGC CGGGGAAGGC TCGGTGGGTG GCGGCCTTGC TGTGAGCCCC AACCCTGGCG 54 0 
ACAAGCCCAT GACCCAGCGG GCCCTGACCG TGTTGATGGT GGTGAGCGGC GCGGTGCTGG 600 
TGTACTTCGT GGTCAGGACG GTCAGGATGA GAAGAAGAAA CCGAAAGACT AGGAGATATG 660 
GAGTTTTGGA CACTAACATA GAAAATATGG AATTGACACC TTTAGAACAG GATGATGAGG 720
ATGATGACAA CACGTTGTTT GATGCCAATC ATCCTCGAAG AAGAGAATGT GCCTTTTGAT 780
GAAAGAACTT TATCTTTCTA CAATGAAGAG TGGAATTTCT ATGTTTAAGG AATAAGAAGC 840
CACTATATCA ATGTTGGGGG GGTATTTAAG TTACATATAT TTTAACAACC TTTAATTTGC 900
TGTTGCAATA AATACCGTAT CCTTTTATTA TATCTTTATA TGTATAGAAG TACTCTATTA 960
ATGGGCTCAG AGATGTTGGG GATAAAGTAT ACTGTAATAA TTTATCTGTT TGAAAATTAC 1020 
TATAAAACGG TGTTTTCTGA TCGGTTTTTG TTTCCTGCTT ACCATATGAT TGTAAATTGT 1080 
TTTATGTATT AATCAGTTAA TGCTAATTAT TTTTGCTGAT GTCATATGTT AAAGAGCTAT 1140
AAATTCCAAC AACCAACTGG TGTGTAAAAA TAATTTAAAA TTTCCTTTAC TGAAAGGTAT 1200
TTCCCATTTT TGTGGGGAAA AGAAGCCAAA TTTATTACTT TGTGTTGGGG TTTTTAAAAT 1260
ATTAAGAAAT GTCTAAGTTA TTGTTTGCAA AACAATAAAT ATGATTTTAG 1310



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: SPLNNOT04 

(B) CLONE: 1534876 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98 : 
CCATGCTCCA GGCATACAGA TGTGGTTTCT CGGCTGCACC GGGCCAGGCT GCGGGTGTGC 60 
AGGCGTCTGC AAAGTTGTGC CATGTATCAG CACAGGCTTT GAGACGTCTG GACCCTGTCC 120 
TTCCTCCCGT GAGGGGTTCT TGTTCTTTCT GACTCAGGTG ACTTTTCAGC CCTTCCAATT 180 
CCCCTCTTTT TCTGCCCTCC CCTCCAACTC AGCCAACCCA GGTGTGGGCA GTCAGGGAGG 240
GAGGGAGTGT CCCACCACGT TCTCAGGGCA GCCCTTGACT CCTAAGCCCC TTCCTCCTTC 300
CATTCTGCAT CCCCTCCCCA TCCAACCTAA ATGCCCACAG CTGGGGCTGA GCTGTATTCC 360
TGTGGAGGGA CCTCTGCCGT GCCTCTCTGA GGTCAGGCTG TGCTGTGTGA TGGGCAGGCT 420
TTGCCCCAGC CCACCCCTGG CAAGGTGCAC TTGTTTTCTG GTTTGTACAA GGTGTCCTGG 480
GGGCCCGTCG CTTCCCTGCC AGTGAGGAGT GACTTCTCCC TCTCTTCCAG TCCTGTAGGG 540
GAGACAAAAC CAGATTGGGG GGCCCAAGGG GAGCATGGAA AAGGCCGGCT CCCCTGTCTT 600 
TCCTTGGCTG TCAGAGTCAG GGTAACACAC ACCAAGAGTG GAGTGCGGCC AGCAAGTTTG 660 
AGACCTGCCC GCCCTCCTCG CAGCTCTGCT CTGTGTCCTC AGGAAGTCAC AGAGTCTACT 720 
GAGGCAAGGA GAGGGTGATT CTTTCCCCAA ATCCCTTCTT CCCTGGTTCC CAAACCAAAG 780
ACAGCCTGCA GCCCTTTCTG CATGGGGTGC TCTGTTGACA GGCTTCCCAG ATCCCTGAGT 840
CTCTCTTTCC TTCCTCCTCG ATCTTTAGTT GTCCACGGTC AATTCAGTGC TTCCATTGGG 900 
GGACAGTCCC CTCCGGGATG ACCTGATTCA CCTCCAGCCC AGGGAATGGA ATCTAGAGGA 960 
ATACGTGGGG TGGGTCTGGA CAAGGAGCGG CAGGAATCAC CACCCATCTC CAGCTGTGGA 1020 
GCCCTGTGGA GGGGAAGGGG AAGCTTGGGG TTCAGAGGGA ACTCTTCCAG GAGAGGGGTG 1080 
CCCAGCGGAG GTAAAGATGA TAGAGGGTTG TGGGGGGTCT CTAGTTGAAT GTTTTGGCCC 1140
ATGACTTTGG AACATGGCTG GCAGCTTCCA GCAGAAGTCA CGCTCCCCAT CCCCCAGGGG 1200
ACATAGGACC TTTTTCCTGC TTCCTGGTCA CTTTCAAAGA ACTATTTGCG CAATCTGTGG 1260 
GTCTGTGGAT TCACGGGGCT TTCTGTGTGG GTGCTGCAGT TGCTTTTGTC TGCAGCAGCA 1320 
GGACACATCT TTCCTCTTAC TCAGCCCTTT ATGGCCCATG GGGAACTCCG TGGCTCAGGG 1380 
AGAGCTGAAC TCCAGGGGTG TGACCTGGGA CAGGTGGGCC TGAGGTGCCC AGCTCAGGGC 1440
AGCCAGGTGG CTCATGGGCT GTAGTGAGCC AGCTCCCTGG GGGAAAAGGC TGTGGGCCGT 1500
TAGGACCATC CTCCAGGACA GGTGACCTCT ATGAGGTCAC CTACGGCTGT GGCCGTGCAG 1560
GCCTCCTTCC AGCCCAGAGT GGCCCAGTAG AGCAAGGCAG ACAGTGACCT CCACCCCCGC 1620
AGCCCTCTTA AAAGGCCAGT ACTCTTGGGG GTGGGGGGAG GGTTTAGAAA GCATTTGCCC 1680
ATCTGCCTTT CTTTCCCCCA GCCCCCACCC GCTTTGAATG TAGAGACCCG TGGGCACTTT 1740
TCCTTTTGTG GTGGGGGGTG CGGAGGAGGT ACCCCCACCC CTGGCACAGC CGCCTGGAAT 1800
GCAGGACTGT CACTGCTGTT CGGGTGATGA CCTCGTTGCC AAGCTCCTCC TGTCCCCTTG 1860
TTCTGGGGGC AGGCGCTGTG CTTCTGTGAG GTGGTTTAGC TTTTGCTTTC GAAGTGGCCA 1920
GCTGCGGCCA CCAGGTCTCA GCACAAGAGC GCTTCCTTTG CACAGAATGA GCTTCGAGCT 1980
TTGTTCAGAC TAAATGAATG TATCTGGGAG GGGTCGGGGG CACGAGTTGA TTCCAAGCAC 2040
ATGCCTTTGC TGAGTGTGTG TGTGCTGGGA GAGTCAGAGT GGATGTAGAG CGCGGTTTTA 2100
TTTTTGTACT GACATTGGTA AGAGACTGTA TAGCATCTAT TTATTTAGAT GATTTATCTG 2160
GTAAATGAGG CAAAAAAATT ATTAAAAATA CATTAAAGAT GATTTAAAAA AAAGACCAAA 2220
AAACCAAGAA ACCCAAAGCC CAAGAATGCG CGTAGCATCC AAAAAAAAAA GG 2272



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1060 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SPLNNOT04 

(B) CLONE: 1559131 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99 : 

GTCAACTTAG CGAGCGCAAC AGGCTGCCGC TGAGGAGCTG GAGCTGGTGG GGACTGGGCC 60 

GCAATGGACA AGCTGAAGAA GGTGCTGAGC GGGCAGGACA CGGAGGACCG GAGCGGCCTG 120 

TCCGAGGTTG TTGAGGCATC TTCATTAAGC TGGAGTACCA GGATAAAAGG CTTCATTGCG 180

TGTTTTGCTA TAGGAATTCT CTGCTCACTG CTGGGTACTG TTCTGCTGTG GGTGCCCAGG 240

AAGGGACTAC ACCTCTTCGC AGTGTTTTAT ACCTTTGGTA ATATCGCATC AATTGGGAGT 300 

ACCATCTTCC TCATGGGACC AGTGAAACAG CTGAAGCGAA TGTTTGAGCC TACTCGTTTG 360 

ATTGCAACTA TCATGGTGCT GTTGTGTTTT GCACTTACCC TGTGTTCTGC CTTTTGGTGG 420 

CATAACAAGG GACTTGCACT TATCTTCTGC ATTTTGCAGT CTTTGGCATT GACGTGGTAC 480

AGCCTTTCCT TCATACCATT TGCAAGGGAT GCTGTGAAGA AGTGTTTTGC CGTGTGTCTT 540 

GCATAATTCA TGGCCAGTTT TATGAAGCTT TGGAAGGCAC TATGGACAGA AGCTGGTGGA 600

CAGTTTTGTA ACTATCTTCG AAACCTCTGT CTTACAGACA TGTGCCTTTT ATCTTGCAGC 660

AATGTGTTGC TTGTGATTCG AACATTTGAG GGTTACTTTT GGAAGCAACA ATACATTCTC 720
GAACCTGAAT GTCAGTAGCA CAGGATGAGA AGTGGGTTCT GTATCTTGTG GAGTGGAATC 780
TTCCTCATGT ACCTGTTTCC TCTCTGGATG TTGTCCCACT GAATTCCCAT GAATACAAAC 840 
CTATTCAGCA ACAGCACATA AGCCTTGGGT GCAAGTGATT CCCAGGTGGC AAAAGGCAGC 900 
CCCATCAGAG ATCACGGGAG CAACAGTAAG GGACAGAGTT TTGGGGTCCA CTTGTCCCTC 960 
AGCATGGAAG CCATCACCGT GGTCCTGCAT AGAGTGAGTC TGCTTCTACT CTGGCATCTG 1020 
AGAACAAGTG ACTCTGCTTT AGACAAGCCC CTGGAGAGGG 1060 



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 543 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BLADNOT03 

(B) CLONE: 1601473 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100 : 



GCTCACAGTA GCCCGGCGGC CAGGGCAATC CGACCACATT TCACTCTCAC CGCTGTAGGA 60
ATCCAGATGC AGGCCAAGTA CAGCAGCACA AGGGACATGC TGGATGATGA TGGGGACACC 120
ACCATGAGCC TGCATTCTCA AGCCTCTGCC ACAACTCGGC ATCCAGAGCC CCGGCGCACA 180
GAGCACAGGG CTCCCTCTTC AACGTGGCGA CCAGTGGCCC TGACCCTGCT GACTTTGTGC 240
TTGGTGCTGC TGATAGGGCT GGCAGCCCTG GGGCTTTTGT GTAAGTCTGC GCTCTGACCT 300
GGGGGAGGAT CCTGGTTCCA AGTTTTTCAG TACTACCAGC TCTCCAATAC TGGTCAAGAC 360
ACCATTTCTC AAATGGAAGA AAGATTAGGA AATACGTCCC AAGAGTTGCA ATCTCTTCAA 420
GTCCAGAATA TAAAGCTTGC AGGAAGTCTG CAGCATGTGG CTGAAAAACT CTGTCGTGAG 480
CTGTATAACA AAGCTGGAGC ACACAGGTGC AGCCCTTGTA CAGAACAATG GAAATGGCAT 540
GGA 543



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2281 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: BRAITUT12 

(B) CLONE: 1615809 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101 : 
AGCTGGCTCA CCTTCCAGAT TCACCTGCAG GAGCTGCTGC AGTACAAGAG GCAGAATCCA 60 
GCTCAGTTCT GCGTTCGAGT CTGCTCTGGC TGTGCTGTGT TGGCTGTGTT GGGACACTAT 120 
GTTCCAGGGA TTATGATTTC CTACATTGTC TTGTTGAGTA TCCTGCTGTG GCCCCTGGTG 180 
GTTTATCATG AGCTGATCCA GAGGATGTAC ACTCGCCTGG AGCCCCTGCT CATGCAGCTG 240
GACTACAGCA TGAAGGCAGA AGCCAATGCC CTGCATCACA AACACGACAA GAGGAAGCGT 300
CAGGGGAAGA ATGCACCCCC AGGAGGTGAT GAGCCACTGG CAGAGACAGA GAGTGAAAGC 360
GAGGCAGAGC TGGCTGGCTT CTCCCCAGTG GTGGATGTGA AGAAAACAGC ATTGGCCTTG 420
GCCATTACAG ACTCAGAGCT GTCAGATGAG GAGGCTTCTA TCTTGGAGAG TGGTGGCTTC 480
TCCGTATCCC GGGCCACAAC TCCGCAGCTG ACTGATGTCT CCGAGGATTT GGACCAGCAG 540
AGCCTGCCAA GTGAACCAGA GGAGACCCTA AGCCGGGACC TAGGGGAGGG AGAGGAGGGA 600
GAGCTGGCCC CTCCCGAAGA CCTACTAGGC CGTCCTCAAG CTCTGTCAAG GCAAGCCCTG 660
GACTCGGAGG AAGAGGAAGA GGATGTGGCA GCTAAGGAAA CCTTGTTGCG GCTCTCATCC 720
CCCCTCCACT TTGTGAACAC GCACTTCAAT GGGGCAGGGT CCCCCCAAGA TGGAGTGAAA 780 
TGCTCCCCTG GAGGACCAGT GGAGACACTG AGCCCCGAGA CAGTGAGTGG TGGCCTCACT 840 
GCTCTGCCCG GCACCCTGTC ACCTCCACTT TGCCTTGTTG GAAGTGACCC AGCCCCCTCC 900 
CCTTCCATTC TCCCACCTGT TCCCCAGGAC TCACCCCAGC CCCTGCCTGC CCCTGAGGAA 960 
GAAGAGGCAC TCACCACTGA GGACTTTGAG TTGCTGGATC AGGGGGAGCT GGAGCAGCTG 1020 
AATGCAGAGC TGGGCTTGGA GCCAGAGACA CCGCCAAAAC CCCCTGATGC TCCACCCCTG 1080
GGGCCCGACA TCCATTCTCT GGTACAGTCA GACCAAGAAG CTCAGGCCGT GGCAGAGCCA 1140
TGAGCCAGCC GTTGAGGAAG GAGCTGCAGG CACAGTAGGG CTTCTTGGCT AGGAGTGTTG 1200
CTGTTTCCTC CTTTGCCTAC CACTCTGGGG TGGGGCAGTG TGTGGGGAAG CTGGCIGTCG 1260
GATGGTAGCT ATTCCACCCT CTGCCTGCCT GCCTGCCTGC TGTCCTGGGC ATGGTGCAGT 1320 
ACCTGTGCCT AGGATTGGTT TTAAATTTGT AAATAATTTT CCATTTGGGT TAGTGGATGT 1380 
GAACAGGGCT AGGGAAGTCC TTCCCACAGC CTGCGCTTGC CTCCCTGCCT CATCTCTATT 1440
CTCATTCCAC TATGCCCCAA GCCCTGGTGG TCTGGCCCTT TCTTTTTCCT CCTATCCTCA 1500
GGGACCTGTG CTGCTCTGCC CTCATGTCCC ACTTGGTTGT TTAGTTGAGG CACTTTATAA 1560
TTTTTCTCTT GTCTTGTGTT CCTTTCTGCT TTATTTCCCT GCTGTGTCCT GTCCTTAGCA 1620 
GCTCAACCCC ATCCTTTGCC AGCTCCTCCT ATCCCGTGGG CACTGGCCAA GCTTTAGGGA 1680
GGCTCCTGGT CTGGGAAGTA AAGAGTAAAC CTGGGGCAGT GGGTCAGGCC AGTAGTTACA 1740
CTCTTAGGTC ACTGTAGTCT GTGTAACCTT CACTGCATCC TTGCCCCATT CAGCCCGGCC 1800
TTTCATGATG CAGGAGAGCA GGGATCCCGC AGTACATGGC GCCAGCACTG GAGTTGGTGA 1860
GCATGTGCTC TCTCTTGAGA TTAGGAGCTT CCTTACTGCT CCTCTGGGTG ATCCAAGTGT 1920
AGTGGGACCC CCTACTAGGG TCAGGAAGTG GACACTAACA TCTGTGCAGG TGTTGACTTG 1980
AAAAATAAAG TGTTGATTGG CTAGAACTGC TGCCTCCCTG ACTGTGAGCT GCCTTCCACA 2040
CCCTGCACTG CACTGTGTTC TCTCCTCACC CTTAACCTGC TTCACTCCAG TCTGTTCTGG 2100
CTGTTTATTA CCTTGTTGCA AAACAGGGCC GAAGCAAGGA TTACCTTGAC AACCCTAGCT 2160
TCTCCTTAGC CATCTTCCTT GACAGTGTGA TCTGTTTAGT GAGATTTAGC ATGTGTGAAT 2220
AAAGTATATG CAGGAGGAAA TTGCTTTGTC TTCCCAATCG GTAGAAATTC GAGACCTAGC 2280
C 2281



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 992 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: COLNNOT19

(B) CLONE: 1634813 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102 : 



GACAGCTTGG CCTACAGCCC GGCGGGCATC AGCTCCCTTG ACCCAGTGGA TATCGGTGGC 60
CCCGTTATTC GTCCAGGTGC CCAGGGAGGA GGACCCGCCT GCAGCATGAA CCTGTGGCTC 120
CTGGCCTGCC TGGTGGCCGG CTTCCTGGGA GCCTGGGCCC CCGCTGTCCA CGCCCAAGGT 180
GTCTTTGAGG ACTGCTGCCT GGCCTACCAC TACCCCATTG GGTGGGCTGT GCTCCGGCGC 240
GCCTGGACTT ACCGGATCCA GGAGGTGAGC GGGAGCTGCA ATCTGCCTGC TGCGATATTC 300
TACCTCCCCA AGAGACACAG GAAGGTGTGT GGGAACCCCA AAAGCAGGGA GGTGCAGAGA 360
GCCATGAAGC TCCTGGATGC TCGAAATAAG GTTTTTGCAA AGCTCCGCCA CAACACGCAG 420
ACCTTCCAAG CAGGCCCTCA TGCTGTAAAG AAGTTGAGTT CTGGAAACTC CAAGTTATCA 480
TCATCCAAGT TTAGCAATCC CATCAGCAGC AGCAAGAGGA ATGTCTCCCT CCTGATATCA 540
GCTAATTCAG GACTGTGAGC CGGCTCATTT CTGGGCTCCA TCGGCACAGG AGGGGCCGGA 600
TCTTTCTCCG ATAAAACCGT CGCCCTACAG ACCCAGCTGT CCCCACGCCT CTGTCTTTTG 660
GGTCAAGTCT TAATCCCTGC ACCTGAGTTG GTCCTCCCTC TGCACCCCCA CCACCTCCTG 720
CCCGTCTGGC AACTGGAAAG AGGGAGTTGG CCTGATTTTA AGCCTTTTGC CGCTCCGGGG 780
ACCAGCAGCA ATCCTGGGCA GCCAGTGGCT CTTGTAGAGA AGACTTAGGA TACCTCTCTC 840
ACTTTCTGTT TCTTGCCGTC CACCCCGGGC CATGCCAGTG TGTCCCTCTG GGTCCCTCCA 900 
AAACTCTGGT CAGTTCAAGG ATGCCCCTCC CAGGCTATGC TTTTCTATAA CTTTTAAATA 960 
AACCTTGGGG GTTGATGGAG TCAAAAAAAA AA 992 



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1554 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: UTRSNOT06 

(B) CLONE: 1638407 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103 : 

TCGCCCAGGA GTCATCGGAC GCCAGAATCT GTGTCTCCAG AACGCTATAG CTATGGCACC 60 

TCCAGCTCTT CAAAGAGGAC AGAGGGTAGC TGCCGTCGCC GTCGGCAGTC AAGCAGTTCT 120 

GCAAATTCTC AGCAGGGTCA GTGGGAGACA GGCTCCCCCC CAACCAAGCG GCAGCGGCGG 180 

AGTCGGGGCC GGCCCAGTGG TGGTGCCAGA CGGCGGCGGA GAGGGGCCCC AGCCGCACCC 240

CAGCAGCAGT CAGAGCCCGC CAGACCTTCC TCTGAAGGCA GGTGACACTG TGATGGGGAA 300 

ACAGGCTCAG AGAGACATCC GGCTCCGGGT TCGAGCAGAG TACTGCGAGC ATGGGCCAGC 360

CTTGGAGCAG GGCGTGGCAT CCCGGCGGCC CCAGGCGCTG GCGCGGCAGC TGGACGTGTT 420

TGGGCAGGCC ACCGCAGTGC TGCGCTCAAG GGACCTGGGC TCTGTGGTTT GTGACATCAA 480

GTTCTCAGAG CTCTCCTATC TGGACGCCTT CTGGGGCGAC TACCTGAGTG GCGCCCTGCT 540

GCAGGCCCTG CGGGGCGTGT TCCTGACTGA GGCCCTGCGA GAGGCTGTGG GCCGGGAGGC 600 

TGTTCGCCTG CTGGTCAGTG TGGATGAGGC TGACTATGAG GCTGGCCGGC GCCGCCTGTT 660 

GCTGATGGCG GAGGAAGGGG GGCGGCGCCC GACAGAGGCC TCCTGATCCA GGACTGGCAG 720 

GATTGATCCC ACCTCCAAGT CTCCGGGCCA CCTTCTCCTG GGAGGACGAC CATCTCTACC 780

CCTAGAGGAC TGTCACTCTA GCATCTTTGA GGACTGCGAC AGGACCGGGA CAGCAGGCCC 840

CTTGACAGCC CCTCCCACAG GATGTGGGCT CTGAGGCCTA AACCATTTCC AGCTGAGTTT 900

CCTTCCCAGA CTCCTCCTAC CCCCAGGTGT GCCCCCTTAG CCTCCGGAGG CGGGGGCTGG 960
GCCTGTATCT CAGAAGGGAG GGGCACAGCT ACACACTCAC CAAAGGCCCC CCTGCACATT 1020
GTATCTCTGA TCTTGGGCTG TCTGCACTGT CACAGGTGCA CACACTCGCT CATGCTCACA 1080 
CTGCCCCTGC TGAGATCTTC CCTGGGCCTC TGCCCTGGCC TGCTTCCCAG CACACACTTC 1140
TTTGGCCTAA GGGCTTCTCT CTCAGGACCT CTAATTTGAC CACAACCAAC CTGGGCTTCA 1200 
GCCACATCAG TGGGCACTGG AGCTGGGGTG CACATGGGGC CTGCTCACCT TGCCCACACA 1260 
TCTCCAGCCA GCCAGGGCCC TGCCCAGCTT CAATTTACAG ACCTGACTCT CCTCACCTTC 1320 
CCCCCTGCTG TCCAGAGCTG AACATAGACT TGCACTTGGA TGTCACCTGG AGTGTCACAT 1380 
GGGAGTGTTA TGGCAGCATC ATACCAAGGC CTACTGTTGC ACATGGGGCC AAAACCAGTA 1440
AACAGCCACC TTCTTGGAAA GGGAATGCAA AGGCTTTGGG GGTGATGGAA AAGACCTTTT 1500
ACAAATGATA CCAATTAAAC TGCCCTGGAA AGGGCATAGG TGGGAAAAAA AAAA 1554



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1802 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSTUT08 

(B) CLONE: 1653112 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104 : 

GTCGCCGGGC TTGCGATGAA CTTCCGGCTG TCAAGCTCCC GGCCGGGCTG ACTCAAGCGG 60 

AGGCGCGCGG AACAGTCGCC GAGGCGATTC CCGCCCAGGC TCCTGTAACC GCCAGGCAGC 120 

GGCCCCGCCA TGTCCCAGCC CCGGACCCCA GAGCAGGCAC TGGATACACC GGGGGACTGC 180

CCCCCAGGCA GGAGAGACGA GGACGCTGGG GAGGGGATCC AGTGCTCCCA ACGCATGCTC 240

AGCTTCAGTG ACGCCCTGCT GTCCATCATC GCCACCGTCA TGATCCTGCC TGTGACCCAC 300

ACGGAGATCT CCCCAGAACA GCAGTTCGAC AGAAGTGTAC AGAGGCTTCT GGCAACACGG 360

ATTGCCGTCT ACCTGATGAC CTTTCTCATC GTGACAGTGG CCTGGGCAGC ACACACAAGG 420

TTGTTCCAAG TTGTTGGGAA AACAGACGAC ACACTTGCCC TGCTCAACCT GGCCTGCATG 480

ATGACCATCA CCTTCCTGCC TTACACGTTT TCGTTAATGG TGACCTTCCC TGATGTGCCT 540

CTGGGCATCT TCTTGTTCTG TGTGTGTGTG ATCGCCATCG GGGTCGTGCA GGCACTGATT 600 

GTGGGGTACG CATTCCACTT CCCGCACCTG CTGAGCCCGC AGATCCAGCG CTCTGCCCAC 660 

AGGGCTCTGT ACCGACGACA CGTCCTGGGC ATCGTCCTCC AAGGCCCGGC CCTGTGCTTT 720
GCAGCGGCCA TCTTCTCTCT CTTCTTTGTC CCCTTGTCTT ACCTGCTGAT GGTGACTGTC 780
ATCCTCCTCC CCTATGTCAG CAAGGTCACC GGCTGGTGCA GAGACAGGCT CCTGGGCCAC 840
AGGGAGCCCT CGGCTCACCC AGTGGAAGTC TTCTCGTTTG ACCTCCACGA GCCACTCAGC 900
AAGGAGCGCG TGGAAGCCTT CAGCGACGGA GTCTACGCCA TCGTGGCCAC GCTTCTCATC 960
CTGGACATCT GCGAAGACAA CGTCCCGGAC CCCAAGGATG TGAAGGAGAG GTTCAGCGGC 1020
AGCCTCGTGG CCGCCCTGAG TGCGACCGGG CCGCGCTTCC TGGCGTACTT CGGCTCCTTC 1080
GCCACAGTGG GACTGCTGTG GTTCGCCCAC CACTCACTCT TCCTGCATGT GCGCAAGGCC 1140
ACGCGGGCCA TGGGGCTGCT GAACACGCTC TCGCTGGCCT TCGTGGGTGG CCTCCCACTA 1200
GCCTACCAGC AGACCTCGGC CTTCGCCCGG CAGCCCCGCG ATGAGCTGGA GCGCGTGCGT 1260
GTCAGCTGCA CCATCATCTT CCTGGCCAGC ATCTTCCAGC TGGCCATGTG GACCACGGCG 1320
CTGCTGCACC AGGCGGAGAC GCTGCAGCCC TCGGTGTGGT TTGGCGGCCG GGAGCATGTG 1380
CTCATGTTCG CCAAGCTGGC GCTGTACCCC TGTGCCAGCC TGCTGGCCTT CGCCTCCACC 1440
TGCCTGCTGA GCAGGTTCAG TGTGGGCATC TTCCACCTCA TGCAGATCGC CGTGCCCTGC 1500
GCCTTCCTGT TGCTGCGCCT GCTCGTGGGC CTGGCCCTGG CCACCCTGCG GGTCCTGCGG 1560
GGCCTCGCCC GGCCCGAACA CCCCCCGCCA GCCCCCACGG GCCAGGACGA CCCACAGTCC 1620
CAGCTCCTCC CTGCCCCCTG CTAGCAGCCA CAGAGCCCAC TCCCAGCCGT CCTCACCAGA 1680
GATGGACCAG GGAGGACAGG ATGCTGGGCA GGGGAAGCCA AGTCACGGGC AGGCCGCAGT 1740
GGTTCTTGCG TGGCCTGGTT TTATTTTCAT TGTGAAATAT CATGCTCTTA TTTCAGTCCT 1800
CA 1802



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1395 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTNOT09 

(B) CLONE: 1664634 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105 : 

GTACCTCGGC TTATTTCATA AACAGGTACT GAAGGAAGCA GAGGCATGTG GAGGACTTCC 60 

CCACCTCGTG CAGCTATTTG GGCCGTGGCA TCTGAAATTT CTTATTTCAG AGTCACCCCT 120
TTGATGACCT TGGCAGTGAA CTGCAGTCAT CTGTTTAGGC CTTTCCATGG CCCACGTCAA 180
TGCCGGTATT TCTGTTTGTT GCACATTTGA TTTCCTTGTT GTTGGCATTT AGAAGGCCCT 240
CGAGCCGCAC TGAGGGACTG AGCCTGGTGT ATATGGCAGC AAGACTGGAT GGTGGCTTTG 300
CAGCAGTCTC CAGAGCATTC CATGAGATCC GGGCTCGAAA TCCAGCATTT CAGCCACAAA 360
CTTTGATGGA CTTTGGCTCA GGTACTGGTT CTGTCACCTG GGCTGCTCAC AGTATTTGGG 420
GCCAGAGCCT ACGTGAATAT ATGTGTGTGG ACAGATCAGC TGCCATGTTG GTTTTGGCAG 480
AAAAACTACT GACAGGTGGT TCAGAATCTG GGGAGCCTTA TATTCCAGGT GTCTTTTTCA 540
GACAGTTTCT ACCTGTATCA CCCAAGGTGC AGTTTGATGT AGTAGTGTCA GCTTTTTCCT 600
TAAGTGACCA GCTACTGACA TTTATACTTT CGTGTAATTC AAGTCTTCTG CATATTTTCC 660
CCTTTTGTGA ACAGGTACTG GTGGAGAATG GAACAAAAGC TGGGCACAGC CTTCTCATGG 720
ATGCCAGGGA TCTGGTCCTT AAGGGAAAAG AGAAGTCACC TTTGGACCCT CGACCTGGTT 780
TTGTCTTTGC CCCGTGTCCC CATGAACTCC CTTGTCCCCA GTTGACCAAC CTGGCCTGTA 840
GCTTCTCACA GGCGTACCAT CCCATCCCCT TCAGCTGGAA CAAGAAACCA AAGGAAGAAA 900
AGTTCTCTAT GGTGATCCTT GCTCGGGGGT CTCCAGAGGA GGCTCATCGC TGGCCCCGTA 960
TCACTCAGCC TGTCCTTAAA CGGCCTCGCC ATGTGCATTG TCACTTGTGC TGTCCAGATG 1020
GGCACATGCA GCATGCTGTG CTCACAGCCC GCCGGCACGG CAGGTATGGG GGGTGTGACC 1080
AAAATCAGTG GGATGTGGCA GGAAGCTGCA GCCCACGCCA GCATCTGTTT CCACAGGGAT 1140
TTGTATCGTT GTGCCCGTGT CAGCTCCTGG GGAGATCTTT TACCTGTGCT TACTCCGTCT 1200
GCGTTTCCTC CATCTACGGC TCAGGATCCC TCTGAGAGTT GATGAGGATG TGTAACAAGT 1260
ATTTTCTTCT ATCGTGCCTG CCAGGGCTGA AGCTGCCTGG TATCCAGGAG GGGAATGCTG 1320
GTATCCCCAT ATGTCTGTGT TTGTTTGAGA TTTTTAATAA TAAATAATAA ATTTTTGAAG 1380
AATGGAAAAA AAAAA 1395



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1635 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSTUT10 

(B) CLONE: 1690990 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106 : 
CCCTCTTCCT TTTGCGCACG GAAGAACAAA TCACAACAAT CACACACCAG GACTGAATCC 60
ATCAGCAGAT ACTGCCCTGT GGGAAGGGCA GAGGAAAGAG AAGACAGACG GACTGACAGA 120
CACCACAGAG GAACAGGGGA GTTAGCCTGG GACCAATGGA GGAGAAGTAC GAACCCTGGG 180
AAAAAGACGT GTCAGATGAG AAAGTTCCGG AGAGTCCGAT GTCTCATCGC AGGTGTTACA 240
TCATCAGGGT TTGCCATTGG AATACTGAGT GGAGATGGGA AAGAGAAAAG TTAAGGGCTG 300
AAATGGGAGG GGAATGGGAA GAAAAAATGA GAGACAAGAG GGAAATAAGA AAAAACAAAG 360
AGAGCACAAA GACCAGTTTA GGAGAAAGGA CCAATGGGGA CAGTGGCAGA GTGGCGAGGT 420
AGGTGAAGGA CTGAGGCACA GCGTCCTGTT GTGGAGGGAG GAAAGGCAAG CGTTCCGAGG 480
TGGTGAAAAG GAAGGCCTGC TAGGCACGGT GGGGATGAAC GAGGATGCCA TGAGTCACAC 540
AAAAGACAGT GCTGGTGAGG CCCAGCCACA GGAGCCTCAG ATAACTTGGT AAAGGCATGT 600 
CTCCCATTTG GGAACTGATG TTCCTAAGAT CCGCACTGAC GCTGCTCAGC CGGTCCATCA 660 
CACAGCAAAG GCGTGAGGAA GGGTCACTGC CCAGCTGGAC TCCAGGGTGG TCCACGCATG 720
ACAGTCACAC CGAACCTTCA TGAGGATGTG AACTGTTGGC TCCAATTTAC CATTCCCAGC 780
AATTCCACTC AGATATTTGT ATACTAATGT TCACAGCAGC GTGAACTCCA CAGCAGGTGG 840
AGTAATGTTC CATTGTGTGC ATATGCCACA TTTTGTTTAT CCATTCATCT GTTGATGCAC 900
ATTTCGGTTG TTCCCACCTT TGGGCTATTA TTAATAATGC TGCTGTGAAC ATTCCCAAGA 960
GAAATAGGAA GACGGCTTTG CTAAGAACTA AAAAAGGGAT GGACAACAAG GGCATATACC 1020
CAGGGGCAGT GTTCTATCAT GACAGCTTTA CTGAGAGCAG AGTAGTTCTG CTCAGAATCA 1080
GAACACTTGT TCCCTATAGC CCCCCTGATT GCCCCACAAC CACCACCGCA TACTCCCCTT 1140
TTCCCAACCA TGGGCAGCAG ATTGAGCTAT TAACAGAAGT GTCCTTTCGC TGGATTTCTC 1200 
AACCCTTTCC TCATCGTCCA CATAGAGAAA CAGTAACAGA TTGCTACTCA CCCAACACCC 1260 
AGGTCAAGTC CAATGCAGGT AGGAATAACA GCAAATCCTT CAATTTCTTG ATTCTGCTCT 1320 
TAAAAATCTT AACAGAGGCT TCCAGGTTCT GAAAATATTT TCTGCATAAA CGTGTGACAC 1380
TCCATCACGA AACTCCCTTT GGTTATCTGC TTAAACTTAT CGCAAATGTC TGGAACGCTG 1440
GTGGCTTCCA AAATCAACTC CTGGTGCTGC TTAATTAAGG TCAGGGCCAC CCGGAAGATA 1500
ATCTTCGAGC CTTCGTTAAA CAAACAGTCC CAGATCCGAA GCACTGTCTC CACGGGCAAG 1560
ATGTCCACAA ACAGGCAGAT GAACCAGCGG GACACCAGCA GCGTCCACAG CACACCGAGA 1620
CGCTCCATCA GGGGG 1635 

(2) INFORMATION FOR SEQ ID NO: 107: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1485 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: DUODNOT02 

(B) CLONE: 1704050 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107 : 
TTTTTGGTCC CGNCNAAAGN CCNAAAACCC GGNACCCGGG AAGCCNCCCC AANNCNAAAN 60 
TTCCCAGTTN GAANCCCGAA GGNAAAACCC CGGAAAAGNA NNCNGCCCCN AAANTTCNCG 120 
GGCNAAAACC CGGCCNTTIT TTCCCCCCCG GGCGGCCGTT TTGGGCCCCN GANTTTCCAT 180 
TTAAANTNCC NAGNCTTGGG CAACCTAACC AGGNTTTTCC CCCAANCTGG AAAAAGCCGG 240
GCCAAGTTGA GCCGCACCCG CCCCAGAAGT TCAAGGGCCC CCGGCCTCCT GCGCTCCTGC 300
CGCCGGGACC CTCGACCTCC TCAGAGCAGC CGGCTGCCGC CCCGGGAAGA TGGCGAGGAG 360
GAGCCGCCAC CGCCTCCTCC TGCTGCTGCT GCGCTACCTG GTGGTCGCCC TGGGCTATCA 420
TAAGGCCTAT GGGTTTTCTG CCCCAAAAGA CCAACAAGTA GTCACAGCAG TAGAGTACCA 480
AGAGGCTATT TTAGCCTGCA AAACCCCAAA GAAGACTGTT TCCTCCAGAT TAGAGTGGAA 540
GAAACTGGGT CGGAGTGTCT CCTTTGTCTA CTATCAACAG ACTCTTCAAG GTGATTTTAA 600
AAATCGAGCT GAGATGATAG ATTTCAATAT CCGGATCAAA AATGTGACAA GAAGTGATGC 660
GGGGAAATAT CGTTGTGAAG TTAGTGCCCC ATCTGAGCAA GGCCAAAACC TGGAAGAGGA 720
TACAGTCACT CTGGAAGTAT TAGTGGCTCC AGCAGTTCCA TCATGTGAAG TACCCTCTTC 780
TGCTCTGAGT GGAACTGTGG TAGAGCTACG ATGTCAAGAC AAAGAAGGGA ATCCAGCTCC 840
TGAATACACA TGGTTTAAGG ATGGCATCCG TTTGCTAGAA AATCCCAGAC TTGGCTCCCA 900 
AAGCACCAAC AGCTCATACA CAATGAATAC AAAAACTGGA ACTCTGCAAT TTAATACTGT 960 
TTCCAAACTG GACACTGGAG AATATTCCTG TGAAGCCCGC AATTCTGTTG GATATCGCAG 1020 
GTGTCCTGGG AAACGAATGC AAGTAGATGA TCTCAACATA AGTGGCATCA TAGCAGCCGT 1080 
AGTAGTTGTG GCCTTAGTGA TTTCCGTTTG TGGCCTTGGT GTATGCTATG CTCAGAGGAA 1140
AGGCTACTTT TCAAAAGAAA CCTCCTTCCA GAAGAGTAAT TCTTCATCTA AAGCCACGAC 1200
AATGAGTGAA AATGATTTCA AGCACACAAA ATCCTTTATA ATTTAAAGAC TCCACTTTAG 1260
AGATACACCA AAGCCACCGT TGTTACACAA GTTATTAAAC TATTATAAAA CTCTGCTTTG 1320
TCCGACATTT GCAAAGAGGT ACACGAGGAA ATGGAATTGG TATTTCATTT TAATTTTCAT 1380
GACTACTAAC TCACCTGAAC TTGCTATTTT AAACAAATAG TTCTGTCGAC ACCTAAAATA 1440
TAATCTGGCT TCTTGTGTCT GGACTAAGTT AAAAGAATTA AAATA 1485



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 810 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT16 

(B) CLONE: 1711840 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108 : 







PCTGGTCCGC CCGGCCGCGG CCGATCTAGG GGCTGGGGGC 60
TGGAGGCGGG GGTGGGGGTC TGAGCTGCGT CCTGGGCTCG AGGCGTCCCC CGGGGAGTCG 120
CCTCTTAGCG GTGCGTCCGG GCTAGCGGCG AGGGGCCGCC CCAAGTCTTC CCACCGCCGC 180
CACCTTAGCA GCCCGACTTG GGGCCTGGAA AGTGGAGCAC GCGGAGGTGG GAGGGCCCTG 240
CACGCGGCCC CCGGTGGGGA AGGGGACGGG CCAGGGATTC AGACTCGGGC TCTCCCCTCA 300
GGATGCAGCA CCGAGGCTTC CTCCTCCTCA CCCTCCTCGC CCTGCTGGCG CTCACCTCCG 360
CGGTCGCCAA AAAGCAAGAT AAGGTGAAGA AGGGCGGCCC GGGGAGCGAG TGCGCTGAGT 420
GGGCCTGGGG GCCCTGCACC CCCAGCAGCA AAGGATTTGC GGCAGTGGGT TTTCCGCGAG 480
GGCCACCTTG GGGGGGCCCA AGAACCCAAC CGGCAGTCCT GGTTGAAAGG GTTGCCCCTG 540
GAAAGTTGGA AAGAAAGGAG TTTTGGGCAC CCGGACTTTG GAAAGTTGGC CAAATTTTTT 600
GGAAGAAAAC TTGGCGGGTC TGCCGGTCCG TTAAATGGGG GAGGGGACAA AAGAATTGAA 660
AGCCGAAAAA ATGCTTTCTC CGCCGCCAAG AGAGGTCGAA CCCGCGTCTG GCAAGAAGAG 720
AAAAGGGCGC GCCCACACTG TTAACAACAA TATGGCGCCT GAACAGTTGG TGGCACCACA 780
GGGGGAGGGA GACACATACT TGCGCGCGGT 810



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1064 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109 :
TTCCTGGGGC TCCGGGGCGC GGAGAAGCTG CATCCCAGAG GAGCGCGTCC AGGAGCGGAC 60
CCGGGAGTGT TTCAAGAGCC AGTGACAAGG ACCAGGGGCC CAAGTCCCAC CAGCCATGCA 120

GACCTGCCCC CTGGCATTCC CTGGCCACGT TTCCCAGGCC CTTGGGACCC TCCTGTTTTT 180

GGCTGCCTCC TTGAGTGCTC AGAATGAAGG CTGGGACAGC CCCATCTGCA CAGAGGGGGT 240

AGTCTCTGTG TCTTGGGGCG AGAACACCGT CATGTCCTGC AACATCTCCA ACGCCTTCTC 300 

CCATGTCAAC ATCAAGCTGC GTGCCCACGG GCAGGAGAGC GCCATCTTCA ATGAGGTGGC 360 

TCCAGGCTAC TTCTCCCGGG ACGGCTGGCA GCTCCAGGTT CAGGGAGGCG TGGCACAGCT 420

GGTGATCAAA GGCGCCCGGG ACTCCCATGC TGGGCTGTAC ATGTGGCACC TCGTGGGACA 480

CCAGAGAAAT AACAGACAAG TCACGCTGGA GGTTTCAGGT GCAGAACCCC AGTCCGCCCC 540

CGACACTGGG TTCTGGCCTG TGCCAGCGGT GGTCACTGCT GTCTTCATCC TCTTGGTCGC 600

TCTGGTCATG TTCGCCTGGT ACAGGTGCCG CTGTTCCCAG CAACGCCGGG AGAAGAAGTT 660 

CTTCCTCCTA GAACCCCAGA TGAAGGTCGC AGCCCTCAGA GCGGGAGCCC AGCAGGGCCT 720

GAGCAGAGCC TCCGCTGAAC TGTGGACCCC AGACTCCGAG CCCACCCCAA GGCCGCTGGC 780 

ACTGGTGTTC AAACCCTCAC CACTTGGAGC CCTGGAGCTG CTGTCCCCCC AACCCTTGTT 840

TCCATATGCC GCAGACCCAT AGCCGCCTGC AAGGAAGAGA GGACACAGGA GTAGCCACCC 900 

TGAGTGCCGA CCTTTGGTGG CGGGGGCCTG GGTCTCTCGT CCCCACCCGG AAGGGCACAA 960 

GACACCGGGC TTTGCTTGGC AAGGCTTGGG GCCTCTTGTG GTCAACCCAG TTCCCTTGGG 1020 

TGCCGTTGCA GAACCCCTTA GCCCCTTCCA ACGTCGACCA GGTT 1064



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1031 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110 : 

AGTTCCTGCA GGTGCCGGCG GTGACGCGGG CTTACACCGC AGCCTGTGTC CTCATCCACC 60 

GCCGCGGTGC AGCTGGAGCT CCTCAGCCCC TTTCAACTCT ACTTCAACCC GCACCTTGTG 120 

TTCCGGAAGT TCCAGGTGAG GCCGCCTCGC GCCGCGCACC TGGGGCCCGA CCCACCCACC 180

CCGCACCTGA CCGCCCGTCC CCCGTAGGTC TGGAGGCTCG TCACCAACTT CCTCTTCTTC 240 

GGGCCCCTGG GATTCAGCTT CTTCTTCAAC ATGCTCTTCG TGTATCCTGC GCCTGCGGAC 300 

ACGGGCTGGG TGGAGGGCAG GCCGGCCGGG CTGGGAGAGA GGCCGGGACG GGGAAACTGA 360
GGCCCCGCCT GGTGGCACTT CCTATACCGA CGCCGTAGGT TCCGCTACTG CCGCATGCTG 420
GAAGAGGGCT CCTTCCGCGG CCGCACGGCC GACTTCGTCT TCATGTTTCT CTTCGGGGGC 480
GTCCTTATGA CCGTATCCTT CCCGCAGGCT CTGGAACCTC GGGCTAGGGC GCCTCGGCGT 540
CCAGCCTGTG TTGGTCCTGG GGCCAACACA GCCATGCCAG AGAGGGACAC AGTCGCTGTC 600
TCCAGCTTAG CACCGTTCCT GCCTTGGGCG CTCATGGGCT TCTCGCTGCT GCTGGGCAAC 660
TCCATCCTCG TGGACCTGCT GGGGATTGCG GTGGGCCATA TCTACTACTT CCTGGAGGAC 720
GTCTTCCCCA ACCAGCCTGG AGGCAAGAGG CTCCTGCAGA CCCCTGGCTT CCTAAAGCTG 780
CTCCTGGATG CCCCTGCAGA AGACCCCAAT TACCTGCCCC TCCCTGAGGA ACAGCCAGGA 840
CCCCATCTGC CACCCCCGCA GCAGTGACCC CCACCCAGGG CCAGGCCTAA GAGGCTTCTG 900
GCAGCTTCCA TCCTACCCAT GACCCCTACT TGGGGCAGAA AAAACCCATC CTAAAGGCTG 960
GGCCCATGCA AGGGCCCACC TGAATAAACA GAATGAGCTG CAAAAAAAAA AAAAAAGGGC 1020
GGCCGTCGCG A 1031



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2316 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSTUT12 

(B) CLONE: 1812375 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111 : 
GCTGGATAAG ACACCAGGGG AGTCACTACA TGGTTACCGC ATCTGTATCC AGGCCATCCT 60
GCAAGACAAG CCCAAGATTG CCACGGCAAA CCTAGGCAAG TTCCTGGAAC TGCTGAGGTC 120
CCACCAGAGC CGACCAGCAA AGTGTCTCAC CATCATGTGG GCCCTGGGTC AAGCAGGTTT 180
TGCCAACCTC ACCGAGGGAC TGAAAGTGTG GCTGGGGATC ATGCTGCCTG TGCTGGGCAT 240
CAAGTCTCTG TCTCCCTTTG CCATCACATA CCTGGATCGG CTGCTCCTGA TGCATCCCAA 300
CCTTACCAAG GGCTTCGGCA TGATTGGCCC CAAGGACTTC TTCCCACTTC TGGACTTTGC 360
CTATATGCCG AACAACTCCC TGACACCCAG CCTGCAGGAG CAGCTGTGTC AGCTCTACCC 420
CCGACTGAAA ATGCTGGCAT TTGGAGCAAA GCCGGATTCC ACCCTGCATA CCTACTTCCC 480
TTCTTTCCTG TCCAGAGCCA CCCCTAGCTG TCCCCCTGAG ATGAAGAAAG AGCTCCTGAG 540
CAGCCTGACT GAGTGCCTGA CGGTGGACCC CCTCAGTGCC AGCGTCTGGA GGCAGCTGTA 600
CCCTAAGCAC CTGTCACAGT CCAGCCTTCT GCTGGAGCAC TTGCTCAGCT CCTGGGAGCA 660
GATTCCCAAG AAGGTACAGA AGTCTTTGCA AGAAACCATT CAGTCCCTCA AGCTTACCAA 720
CCAGGAGCTG CTGAGGAAGG GTAGCAGTAA CAACCAGGAT GTCGTCACCT GTGACATGGC 780
CTGCAAGGGC CTGTTGCAGC AGGTTCAGGG TCCTCGGGTG CCCTGGACGC GGCTCCTCCT 840
GTTGCTGCTG GTCTTCGCTG TAGGCTTCCT GTGCCATGAC CTCCGGTCAC ACAGCTCCTT 900 
CCAGGCCTCC CTTACTGGCC GGTTGCTTCG ATCATCTGGC TTCTTACCTG CTAGCCAACA 960 
AGCGTGTGCC AAGCTCTACT CCTACAGTCT GCAAGGCTAC AGCTGGCTGG GGGAGACACT 1020 
GCCGCTCTGG GGCTCCCACC TGCTCACCGT GGTGCGGCCC AGCTTGCAGC TGGCCTGGGC 1080 
TCACACCAAT GCCACAGTCA GCTTCCTTTC TGCCCACTGT GCCTCTCACC TTGCGTGGTT 1140
TGGTGACAGT CTCACCAGTC TCTCTCAGAG GCTACAGATC CAGCTCCCCG ATTCCGTGAA 1200
TCAGCTACTC CGCTATCTGA GAGAGCTGCC CCTGCTTTTC CACCAGAATG TGCTGCTGCC 1260
ACTGTGGCAC CTCTTGCTTG AGGCCCTGGC CTGGGCCCAG GAGCACTGCC ATGAGGCATG 1320
CAGAGGTGAG GTGACCTGGG ACTGCATGAA GACACAGCTC AGTGAGGCTG TCCACTGGAC 1380
CTGGCTTTGC CTACAGGACA TTACAGTGGC TTTCTTGGAC TGGGCACTTG CCCTGATATC 1440
CCAGCAGTAG GCCCTGCCTT CCTGGCCACT GATTTCTGCA TGGGTAGACC ATCCAAGACT 1500 
GCAGCGGGTA GAAGGTGGCA GTTCTTCATG GGAGTCTTTT TAACTTGGTG CCTGAGTTCT 1560 
CTCCTAGGCA AGTGGCCAGT TGCCTCCACC TCAGTTCTTC CATCTTTGGT GGGGACAGGG 1620 
CCCAGCAGCA TCTCAGCCTC CTACCCACAA TTCCACTGAA CACTTTTCTG GCCCTACTGC 1680 
ACATGGCCCC CAGCCTCCAT CCTTGTGCTG GTAGCCTCTC ACAACTCCGC CCTTGCCCTC 1740
TGCCTTCCAC TTCCTTCCAT CTCATTTCTA AACCCCAAAC AGCTCATCTC TAAAAAGATA 1800
GAACTCCCAG CAGGTGGCTT CTGTGTTCTT CTGACAAATG ATTCCTGCTT CTCCAGACTT 1860
TAGCAGCCTC CTGTTCCCAT TCTTGGTCAC AGCTCTAGCC ACAGCAGAAG GAAAGGGGCT 1920
TCCAGAAGAA TATAGCACCG CATTGGGAAA CAGCAGCCTC ACCTCCACCT GAAGCCTGGG 1980
TGTGGCTGTC AGTGGACATG GGGAGCTGGA TGGAAATGCC TCTCACTTCA AAATGCCCAG 2040
CCTGCCCCAA ATGCCTCTAA GCCCCTCCCT GTCCCCTCCC TTGTAGTCCT ACTTCTTCCA 2100 
ACTTTCCATT CCCCATCATG CTGGGGGTCT TGGTCACAAG GCTCAGCTTC TCTCCACTGT 2160 
CCATCCCTCC TATCATCTGT AGAGCAGAGC ACAGGCAGTT GTGTGCCTTG GGCCCAGGGA 2220 
ACCCTCCATC AACCTGAGAC AGGACTCAGT ATATGGTTCT TGGGTATGCC CTACCAGGTG 2280 
GAATAAAGGA CACAGATTTG AAAAAAAAAA AAAAAA 2316



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1169 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT20 

(B) CLONE: 1818761 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112 : 



AGCAAGGAGC CAGAGGCCAT GCAGTGGCTC AGGGTCCGTG AGTCGCCTGG GGAGGCCACA 60
GGACACAGGG TCACCATGGG GACAGCCGCC CTGGGTCCCG TCTGGGCAGC GCTCCTGCTC 120
TTTCTCCTGA TGTGTGAGAT CCCTATGGTG GAGCTCACCT TTGACAGAGC TGTGGCCAGC 180
GGCTGCCAAC GGTGCTGTGA CTCTGAGGAC CCCCTGGATC CTGCCCATGT ATCCTCAGCC 240
TCTTCCTCCG GCCGCCCCCA CGCCCTGCCT GAGATCAGAC CCTACATTAA TATCACCATC 300
CTGAAGGGTG ACAAAGGGGA CCCAGGCCCA ATGGGCCTGC CAGGGTACAT GGGCAGGGAG 360
GGTCCCCAAG GGGAGCCTGG CCCTCAGGGC AGCAAGGGTG ACAAGGGGGA GATGGGCAGC 420
CCCGGCGCCC CGTGCCAGAA GCGCTTCTTC GCCTTCTCAG TGGGCCGCAA GACGGCCCTG 480
CACAGCGGCG AGGACTTCCA GACGCTGCTC TTCGAAAGGG TCTTTGTGAA CCTTGATGGG 540
TGCTTTGACA TGGCGACCGG CCAGTTTGCT GCTCCCCTGC GTGGCATCTA CTTCTTCAGC 600
CTCAATGTGC ACAGCTGGAA TTACAAGGAG ACGTACGTGC ACATTATGCA TAACCAGAAA 660
GAGGCTGTCA TCCTGTACGC GCAGCCCAGC GAGCGCAGCA TCATGCAGAG CCAGAGTGTG 720
ATGCTGGACC TGGCCTACGG GGACCGCGTC TGGGTGCGGC TCTTCAAGCG CCAGCGCGAG 780
AACGCCATCT ACAGCAACGA CTTCGACACC TACATCACCT TCAGCGGCCA CCTCATCAAG 840
GCCGAGGACG ACTGAGGGCC TCTGGGCCAC CCTCCCGGCT GGAGAGCTCA GGTGCTGGTC 900
CCGTCCCCTG CAGGGCTCAG TTTGCACTGC TGTGAAGCAG GAAGGCCAGG GAGGTCCCCG 960
GGGACCTGGC ATTCTGGGGA GACCCTGCTT CTATCTTGGC TGCCATCATC CCTCCCAGCC 1020
TATTTCTGCT CCTCTCTTCT CTCTTGGACC TATTTTAAGA AGCTTGCTAA CCTAAATATT 1080
CTAGAACTTT CCCAGCCTCG TAGCCCAGCA CTTCTCAAAC TTGGAAATGC ATGCGAATCA 1140
CCCGGGGTTC GTGTTAAATG CAGATTCTG 1169



(2) INFORMATION FOR SEQ ID NO: 113: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1530 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GBLATUT01 

(B) CLONE: 1824469 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113 : 
TCACAGACTG CGGAGTGGGT CAGGGGCTGC GAGGGCTGCC CCAAGTCCTA CCGGGTTTGC 60 
ACGGGCGCGC CCGGCTCCGC CCGCAAGTGC GCCTTCCTGA CTTACTGCTG GGTGCGCGGG 120 
GCTGGGGGTG CGAGTACCAC CCCTGAAGTC TCTTCCTGGG CGACCTCCGG GGCCTCATTC 180 
TAGGCCTCCT TAAAGAGAAG GATCTAAATT AGGAAAAGGA AGTGCCCTTA TCCACGACCA 240
AGCTCTTCCA CCTGCGGAGC TCGCTTAGTC TGCACCTCAA CCGTGCGGAA AGTGACTGCC 300 
CTGTTTACTG AGGAAAAACT GGGGCTCAGA AAGATACCAT GAGTAGTTTG AAACAGGAAC 360
AAAATCTTCT GAAAGCTCGG AGCAGAAGCC TTTTTGGTCA ACATGGAGGA AAAAAGACGG 420 
CGAGCCCGAG TTCAGGGAGC CTGGGCTGCC CCTGTTAAAA GCCAGGCCAT TGCTCAGCCA 480
GCTACCACTG CTAAGAGCCA TCTCCACCAG AAGCCTGGCC AGACCTGGAA GAACAAAGAG 540
CATCATCTCT CTGACAGAGA GTTTGTGTTC AAAGAACCTC AGCAGGTAGT ACGTAGAGCT 600 
CCTGAGCCAC GAGTGATTGA CAGAGAGGGT GTGTATGAAA TCAGCCTGTC ACCCACAGGT 660 
GTATCTAGGG TCTGTTTGTA TCCTGGCTTT GTTGACGTGA AAGAAGCTGA CTGGATATTG 720
GAACAGCTTT GTCAAGATGT TCCCTGGAAA CAGAGGACCG GCATCAGAGA GGATATAACT 780
TATCAACAAC CAAGACTTAC AGCATGGTAT GGAGAACTTC CTTACACTTA TTCAAGAATC 840 
ACTATGGAAC CAAATCCTCA CTGGCACCCT GTGCTGCGCA CACTAAAGAA CCGCATTGAA 900 
GAGAACACTG GCCACACCTT CAACTCCTTA CTCTGCAATC TTTATCGCAA TGAGAAGGAC 960
AGCGTGGACT GGCACAGTGA TGATGAACCC TCACTAGGGA GGTGCCCCAT TATTGCTTCA 1020
CTAAGTTTTG GTGCCACACG CACATTTGAG ATGAGAAAGA AGCCACCACC AGAAGAGAAT 1080
GGAGACTACA CATATGTGGA AAGAGTGAAG ATACCCTTGG ATCATGGTAC CTTGTTAATC 1140
ATGGAAGGAG CGACACAAGC TGACTGGCAG CATCGAGTGC CCAAAGAATA CCACTCTAGA 1200
GAACCGAGAG TGAACCTGAC CTTTCGGACA GTCTATCCAG ACCCTCGAGG GGCACCCTGG 1260
TGACGTCAGA GCTTTGAGAG AGAAGCTTCA CTGAAACGGA GCAAACCTTC CACTGAGAAG 1320 
CCACTTCAAG AGGCTGGTGC TGCTAGATCT CATGATGTGG CTGTTGGGAA GATGGTGGGG 1380 
TTTGTTTGCC AGCTTGGAGT CCTATTAAAT GAAAGCCAGC AACTCATGTT GGTAATAGGT 1440
CTACTGTGGG AACAGTTATC CCTAACCACA GCTCAAAATC GCTATCATCT TTAGGCAAAT 1500
TAAAATCTAT GTGGCAGTGA AAAAAAAAAA 1530



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1336 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT19

(B) CLONE: 1864292 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114 : 
AGCTCGTACC CCTCGAGTGA AATTCTGAAA TGAAGATGGA GGAGGCAGTG GGAAAAGTTG 60
AAGAACTCAT TGAGTCCGAA GCCCCACCAA AAGCATCTGA ACAAGAGACA GCCAAGGAGG 120
AAGATGGATC TGTAGAACTG GAATCTCAAG TTCAGAAAGA TGGTGTAGCG GATTCTACAG 180
TTATTTCTTC AATGCCCTGC TTGTTGATGG AACTGAGAAG GGACTCTTCT GAGTCTCAGT 240
TAGCATCCAC AGAGAGTGAC AAGCCTACAA CTGGCCGAGT TTATGAGAGT GACCCCTCTA 300
ATCACTGCAT GCTTTCCCCT TCCTCTAGTG GTCACCTGGC TGATTCAGAT ACGTTGTCTT 360
CCGCAGAAGA GAATGAACCC TCTCAGGCAG AAACGGCGGT AGAAGGAGAC CCTTCAGGAG 420
TGTCTGGTGC CACAGTTGGG CGCAAGTCTA GGCGGTCCCG ATCTGAAAGT GAAACTTCCA 480
CTATGGCTGC CAAGAAAAAC CGGCAATCCA GTGATAAACA GAATGGCCGA GTCGCCAAGG 540
TTAAAGGTCA TCGGAGCCAA AAGCACAAGG AGAGGATCAG GCTACTGAGG CAGAAACGGG 600
AGGCTGCTGC AAGGAAGAAA TATAACCTGC TGCAGGACAG TAGTACCAGT GATAGTGACC 660
TGACTTGTGA CTCAAGCACG AGCTCATCAG ATGATGATGA AGAGGTTTCA GGGAGCAGCA 720
AGACAATCAC TGCAGAGATA CCAGATGGAC CTCCAGTTGT AGCTCATTAT GATATGTCTG 780
ACACCAACTC TGACCCAGAA GTGGTAAATG TGGACAATTT ATTGGCGGCT GCAGTAGTTC 840
AAGAGCACAG TAATTCTGTA GGCGGCCAGG ACACAGGAGC TACCTGGAGG ACCAGCGGGC 900
TTCTAGAGGA GCTGAATGCA GAGGCAGGTC ATTTGGATCC AGGATTCCTA GCAAGTGACA 960
AAACATCTGC TGGCAATGCG CCACTCAATG AAGAAATTAA CATTGCGTCT TCAGATAGTG 1020
AAGTAGAGAT TGTGGGAGTT CAGGAACATG CAAGGTGTGT TCATCCTCGA GGTGGTGTGA 1080
TTCAGAGTGT TTCTTCATGG AAGCATGGCT CGGGCACGCA GTATGTTAGC ACCAGGCAAA 1140
CACAGTCATG GACTGCTGTG ACTCCCCAGC AGACTTGGGC TTCACCAGCA GAAGTTGTTG 1200
ACCTTACCTT GGATGAGGAT AGCAGGCGTA AATACCTACT GTAATACAAT GTCACTGTGT 1260
TTCCTCTGCA CTGTTCCCTT CCACTTCCTC ATCCTCTTTG TGACATGGAA GTTCATTGTC 1320
ATAGGGGTAC GGAGCT 1336



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1742 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP1NOT01 

(B) CLONE: 1866437 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115 : 

GCCCCGCCCC CTCCCCGCCC GCCTTCCCGG TGACCTTCAG GGGCCCGGGT GGCGGGCGCA 60
GGCCCCTGCG GCGGCGGCGG GATGTTCGTG CAGGAGGAGA AGATCTTCGC GGGCAAGGTG 120
CTGCGGCTGC ACATCTGCGC GTCCGACGGC GCCGAGTGGC TGGAGGAGGC CACCGAGGAC 180
ACCTCGGTGG AGAAGCTCAA GGAGCGCTGC CTCAAGCACT GTGCTCATGG GAGCTTAGAA 240
GATCCCAAAA GTATAACCCA TCATAAATTA ATCCACGCTG CCTCAGAGAG GGTGCTGAGT 300
GATGCCAGGA CCATCCTGGA AGAGAACATC CAGGACCAAG ATGTCCTATT ATTGAAAAAA 360
AAGCGTGCTC CATCACCACT TCCCAAGATG GCTGATGTCT CAGCAGAAGA AAAGAAAAAA 420
CAAGACCAGA AAGCTCCAGA TAAAGAGGCC ATACTGCGGG CCACCGCCAA CCTGCCCTCC 480
TACAACATGG ACCGGGCCGC GGTCCAGACC AACATGAGAG ACTTCCAGAC AGAACTCCGG 540
AAGATACTGG TGTCTCTCAT CGAGGTGGCG CAGAAGCTGT TAGCGCTGAA CCCAGATGCG 600
GTGGAATTGT TTAAGAAGGC GAATGCAATG CTGGACGAGG ACGAGGATGA GCGTGTGGAC 660
GAGGCTGCCC TGCGGCAGCT CACGGAGATG GGCTTTCCGG AGAACAGAGC CACCAAGGCC 720
CTTCAGCTGA ACCACATGTC GGTGCCTCAG GCCATGGAGT GGCTAATTGA ACACGCAGAA 780
GACCCGACCA TAGACACGCC TCTTCCTGGC CAAGCTCCCC CAGAGGCCGA GGGGGCCACA 840
GCAGCTGCCT CCGAGGCTGC CGCGGGAGCC AGCGCCACCG ATGAGGAGGC CAGAGATGAG 900
CTGACGGAAA TCTTCAAGAA GATCCGGAGG AAAAGGGAGT TTCGGGCTGA TGCTCGGGCC 960
GTCATTTCCC TGATGGAGAT GGGGTTCGAC GAGAAAGAGG TGATAGATGC CCTCAGAGTG 1020
AACAACAACC AGCAGAATGC CGCGTGCGAG TGGCTGCTGG GGGACCGGAA GCCCTCTCCG 1080
GAGGAGCTGG ACAAGGGCAT CGACCCCGAC AGTCCTCTCT TTCAGGCCAT CCTGGATAAC 1140
CCGGTGGTGC AGCTGGGCCT GACCAACCCG AAAACATTGC TAGCATTTGA AGACATGCTG 1200
GAGAACCCAC TGAACAGCAC CCAGTGGATG AATGATCCAG AAACGGGGCC TGTCATGCTG 1260
CAGATCTCTA GAATCTTCCA GACACTAAAT CGCACGTAGG TGGCGTTGTT CCACTCGGCT 1320
ATCAGGCCAC AGCAGCCCCC TGGTGCGGCC CGAGACCGGG CAGAGTGGAC CTCACCTGGA 1380
AACTCACCTT CAGCGCCTCA GCCCTGGACT GTTAGAGGTG CTGCAGCTGC TCCTGCTCTC 1440
TGATCTTATT GCTTATAAAC TTTGGTGACG GTAGTGTGTA AGGCCGTATT TTTAGCATCT 1500
GACAGGTGTT TACAAAAAAG TGGTTGTCGC ACTGGGAAGT GGAGTGATGG CCTCGTCTCC 1560
AGTGCTCCTC TGGGCTCTTG AGTTGCTGCT TGAATTGCCG TGTAGACATT TGCTTGGAGA 1620
GTCCACTTGT TATTTGACGG AGGTAGGTTT CAACCCAGAG TTAATGTCAA GCATGCTAAT 1680
TTAACTAGTC ACTCACAGAT GACTTTTCTT TAATAAAGTC CCTTTTCCTA TTAAAAAAAA 1740
AA 1742



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1074 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SKINBIT01 

(B) CLONE: 1871375 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116 : 
GCGGTGCAGA GGAAGCACAA CCTCTACCGG GACAGCATGG TCATGCACAA CAGCGACCCC 60 
AACCTGCACC TGCTGGCCGA GGGCGCCCCC ATCGACTGGG GCGAGGAGTA CAGCAACAGC 120
GGCGGGGGCG GCAGCCCAGC CCCAGCACCC CGGAGTCAGC CACCCTCTCG GAAAAGCGAC 180
GGCGCGCCAA GCAGGTGGTC TCTGTGGTCC AGGATGAGGA GGTGGGGCTG CCCTTTGAGG 240
CTAGCCCTGA GTCACCACCA CCTGCGTCCC CGGACGGTGT CACTGAGATC CGAGGCCTGC 300
TGGCCCAAGG TCTGCGGCCT GAGAGCCCCC CACCAGCCGG CCCCCTGCTC AACGGGGCCC 360
CCGCTGGGGA GAGTCCCCAG CCTAAGGCCG CCCCCGAGGC CTCCTCGCCG CCTGCCTCAC 420
CCCTCCAGCA TCTCCTGCCT GGAAAGGCTG TGGACCTTGG GCCCCCCAAG CCCAGCGACC 480
AGGAGACTGG AGAGCAGGTG TCCAGCCCCA GCAGCCACCC CGCCCTCCAC ACCACCACCG 540
AGGACNANTT TCAAGGGGTG CAAGAATTGA AGNTTCNTAA GGGCCAANTT GGGGGTCCCC 600
TTGACTTGGN TTGGNAANAT TGGGGCAAAA AGGGCCGGTT TTCCCCNTTT CCCGGGANAC 660
CCCAAGGGAA AGGGGNTTCA AAGCTTCTTN GGGGGGGAAA GGGGGAANCC CTTGGGTNTT 720
TTGTTGGCCN TTTGTGANCA NCAGCGAGGA GAGTGCAAAG GTGCAGAGTN AGTTNTAGGN 780
CANTGGGTCC CTGACTGCTG CANATGGTAA GGNCGTTNNC TTGTGGACCC AAGGCAGGNA 840
AAGNTGTGGG GAGGGAAGCT GGTNTGTGCN TTGTGGGTGG AAGCGGGGAN GGCTGTGTTG 900
NANGGCAGGG AGAGGGCNAA NTGAGTTATT TATTGGGGTT CANGTGAAAA GTTTCTTGNN 960
CCCTGTNTTG TGTTNCTGTG GGATTGATTN TAAGATNGNN AGGGGTNGGT TTTTGGGGTT 1020
TTCCTGGTTG GTGGCCAAAN GGGTTGGAAA ATNGNTGGGG GGGGNTTGGA NAAT 1074



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1454 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LEUKNOT03 

(B) CLONE: 1880830 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117 : 
CCCGGGGGAG GCCTGACCCC CTCCGCACCA CCGTACGGAG CCGCATTTCC CCCGTTTCCC 60 
GAGGGGCATC CAGCCGTGTT GCCTGGGGAG GACCCACCCC CCTATTCACC CTTAACTAGC 120 
CCGGACAGTG GGAGTGCCCC TATGATCACC TGCCGAGTCT GCCAATCTCT CATCAACGTG 180
GAAGGCAAGA TGCATCAGCA TGTAGTCAAA TGTGGTGTCT GCAATGAAGC CACCCCAATC 240
AAGAATGCAC CCCCAGGGAA AAAATATGTT CGATGCCCCT GTAACTGTCT CCTTATCTGC 300 
AAAGTGACAT CCCAACGGAT TGCATGCCCT CGGCCCTACT GCAAAAGAAT CATCAACCTG 360
GGGCCTGTGC ATCCCGGACC TCTGAGTCCA GAACCCCAAC CCATGGGTGT CAGGGTTATC 420 
TGTGGACATT GCAAGAATAC TTTTCTGTGG ACAGAGTTCA CAGACCGCAC TTTGGCACGT 480
TGTCCTCACT GCAGGAAAGT GTCATCTATT GGGCGCAGAT ACCCACGTAA GAGATGTATC 540
TGCTGCTTCT TGCTTGGCTT GCTTTTGGCA GTCACTGCCA CTGGCCTTGC CTTTGGCACA 600 
TGGAAGCATG CACGGCGATA TGGAGGCATC TATGCAGCCT GGGCATTTGT CATCCTGTTG 660 
GCTGTGCTGT GTTTGGGCCG GGCTCTTTAT TGGGCCTGTA TGAAGGTCAG CCACCCTGTC 720 
CAGAACTTCT CCTGAGCCTG ATGACCCACA GACTGTGCCT GGCCCCTCCC TGGTGGGGAC 780
AGTGACACTA CGAAGGGAGC TGGGGTAGTT AAAGGCTCCC GGGGCTTCTA GAAGGAAGCC 840
AAGCAGCTGC CTTCCTTTTC CCTGGGGAGA GGTAGGAAGG AACCAGGCCC TCACTTAGGT 900
TTGGAGGGGC AGATAAGAGC ACTGCTGACC ATCTGCTTTC CTCCAAGGGT TGCTGTGTCT 960
AGGGTGAAGT AGGCAAAACG TTGCCCTTAA AACTGGGCCC TGAAGACGGT TCCAGCCTTG 1020

TCCTTCCTGT GTGCTCCCTG AGAGCCATTC CTGTCCCTTA CACATTCCAG GGCAGGGTGG 1080 

GGGTGGGTAG CCCTGGGGGT TCCCCTCCCT CTTGTGCACC ATTAGGACTT TGCTGCTGCT 1140 

ATTGCACTTC ACCAGAGGTT GGCTCTGGCC TCAGTACCCT CAGTCTCCTC TCCCCACATT 1200 

GTGTCCTGTG GGGGTGGGGT CAGCCGCTGC TCTGTACAGA ACCACAGGAA CTGATGTGTA 1260

TATAACTATT TAATGTGGGA TATGTTCCCC TATTCCTGTA TTTCCCTTAA TTCCTCCTCC 1320

CGACCTTTTT TACCCCCCCA GTTGCAGTAT TTAACTGGGC TGGGTAGGGT TGCTCAGTCT 1380 

TTGGGGGAGG TTAGGGACTT ATCCTGTGCT TGTAAATAAA TAAGGTCATG ACTCTAAAAA 1440
AAAAAAAAGG GCGG 1454



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2071 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: OVARNOT07 

(B) CLONE: 1905325 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118 : 

AGCTTTGAAT TCCTGTATCT GAGAACGGAT CGTTCGAGGT GGTGGAGGGG GTTGGAATTG 60
GGGACCTACG GAAGGCTCAG CTCTTGCCAG GCCAAATTGA GACATGTCTG ACACAAGCGA 120
GAGTGGTGCA GGTCTAACTC GCTTCCAGGC TGAAGCTTCA GAAAAGGACA GTAGCTCGAT 180
GATGCAGACT CTGTTGACAG TGACCCAGAA TGTGGAGGTC CCAGAGACAC CGAAGGCCTC 240
AAAGGCACTG GAGGTCTCAG AGGATGTGAA GGTCTCAAAA GCCTCTGGGG TCTCAAAGGC 300
CACAGAGGTC TCAAAGACCC CAGAGGCTCG GGAGGCACCT GCCACCCAGG CCTCGTCTAC 360
TACTCAGCTG ACTGATACCC AGGTTCTGGC AGCTGAAAAC AAGAGTCTAG CAGCTGACAC 420
CAAGAAACAG AATGCTGACC CGCAGGCTGT GACAATGCCT GCCACTGAGA CCAAAAAGGT 480
CAGCCATGTG GCTGATACAA AGGTCAATAC AAAGGCTCAG GAGACTGAGG CTGCACCCTC 540
TCAGGCCCCA GCAGATGAAC CTGAGCCTGA GAGTGCAGCT GCCCAGTCTC AGGAGAATCA 600
GGATACTCGG CCCAAGGTCA AAGCCAAGAA AGCCCGAAAG GTGAAGCATC TGGATGGGGA 660
AGAGGATGGC AGCAGTGATC AGAGTCAGGC TTCTGGAACC ACAGGTGGCC GAAGGGTCTC 720
AAAGGCTCTA ATGGCCTCAA TGGCCCGCAG GTTTCAAGGG GTCCCATAGC CTTTTGGGCC 780
CGCAGGATTC AAGGACTCGG TTGGCTGCTT GGGCCCGGAG AGCCTTGCTC TCCCTGAGAT 840
CACCTAAAGC CCGTAGGGCA AGGCTCGCCG TAGAGCTGCC AAGCTCCAGT CATCCCAAGA 900 
GCCTGAAGCA CCACCACCTC GGGATGTGGC CCTTTTGCAA GGGAGGGCAA ATGATTTGGT 960 
GAAGTACCTT TTGGCTAAAG ACCAGACGAA GATTCCCATC AAGCGCTCGG ACATGCTGAA 1020 
GGACATCATC AAAGAATACA CTGATGTGTA CCCCGAAATC ATTGAACGAG CAGGCTATTC 1080
CTTGGAGAAG GTATTTGGGA TTCAATTGAA GGAAATTGAT AAGAATGACC ACTTGTACAT 1140 
TCTTCTCAGC ACCTTAGAGC CCACTGATGC AGGCATACTG GGAACGACTA AGGACTCACC 1200
CAAGCTGGGT CTGCTCATGG TGCTTCTTAG CATCATCTTC ATGAATGGAA ATCGGTCCAG 1260
TGAGGCTGTC ATCTGGGAGG TGCTGCGCAA GTTGGGGCTG CGCCCTGGGA TACATCATTC 1320 
ACTCTTTGGG GACGTGAAGA AGCTCATCAC TGATGAGTTT GTGAAGCAGA AGTACCTGGA 1380 
CTATGCCAGA GTCCCCAATA GCAATCCCCC TGAATATGAG TTCTTCTGGG GCCTGCGCTC 1440 
TTACTATGAG ACCAGCAAGA TGAAAGTCCT CAAGTTTGCC TGCAAGGTAC AAAAGAAGGA 1500
TCCCAAGGAA TGGGCAGCTC AGTACCGAGA GGCGATGGAA GCAGATTTGA AGGCTGCAGC 1560
TGAGGCTGCA GCTGAAGCCA AGGCTAGGGC CGAGATTAGA GCTCGAATGG GCATTGGGCT 1620 
CGGCTCGGAG AATGCTGCCG GGCCCTGCAA CTGGGACGAA GCTGATATCG GACCCTGGGC 1680 
CAAAGCCCGG ATCCAGGCGG GAGCAGAAGC TAAAGCCAAA GCCCAAGAGA GTGGCAGTGC 1740
CAGCACTGGT GCCAGTACCA GTACCAATAA CAGTGCCAGT GCCAGTGCCA GCACCAGTGG 1800
TGGCTTCAGT GCTGGTGCCA GCCTGACCGC CACTCTCACA TTTGGGCTCT TCGCTGGCCT 1860
TGGTGGAGCT GGTGCCAGCA CCAGTGGCAG CTCTGGTGCC TGTGGTTTCT CCTACAAGTG 1920
AGATTTTAGA TATTGTTAAT CCTGCCAGTC TTTCTCTTCA AGCCAGGGTG CATCCTCAGA 1980
AACCTACTCA ACACAGCACT CTAGGCAGCC ACTATCAATC AATTGAAGTT GACACTCTGC 2040
ATTAAATCTA TTTGCCATTT CAAAAAAAAA A 2071 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1236 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: BRSTTUT01 

(B) CLONE: 1919931 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
ACCTGGGACC CCCAGAACGG CCGCCCCTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT 60
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTAG AAGGTTGAAA CCAGGCTTAT 120
TTATTTTCAT CTTCTTTCTG CCATCTTTTA ACCAACCTTC TCAGAATAAA ATGTGATTTT 180
TGAGACAGAA TGAAACACAT ATCCAAATTT TAATACAGTA AGAATAGGTA TCCTGAATAA 240
ATGAGAACTC TAGAAAATCA AGGTTTCAAA ATTCTACCCT TCCTGGGAGT TAAAGAAGTT 300
TGGCAGAAAC AGAACAAATT AATCAGCAGA TTCATCACCT GCCAATTTTT TCTGTACAAT 360
TTTCTTGATT CTGGGAGCAT CTGGGTCCAG GCAGATTTTC CTCCCATCCT TCAGTGTGGC 420
TGCTTCTTGT TTCATCCATG GACCCTGCAA GAAATTGCCC CATGTTTCTG TTTGTGCATC 480
ACTGAGAAAG GAAGCATGAA GGTCGCACAG GTCAGGCCAT TCCATTGCCC TCCTGGTGCC 540
GGGTTTGCCC TCCCAATCCT GGGGTTGCTT CAGGGGCTTG TCATTCTCCA TAGTCCCCTC 600
CACATTTCTC AGGTTTCTGC TCAAAAGTCA CCTTTTGGAG GGGTCTCCAC CTGTCACTGT 660
GTTTGTAAGA GCTCCTTCAG TTTCTTTCTA GCTCATCTCA CTCTGGTAAT GTCTTTGATT 720
ACCACCACCA TCTGACCTGG TCTTATGACC TGTTAGCTTT CTTCATCAGA CGTGAGCACC 780
AGGATGGCAG GGGCCTCATC TGTCCTGTTC CTCCTGTGGC CTGGGTCCTA GCACCATGTC 840
TGGTACAGTG TAGATGCTCA AGGGAAGTTT ACTTTGTAAA ACCACTTACC TGGGAGATGT 900
TACTGTTAGT CTAACCTGTA CCATTTTGTA AACCTCCAGC CATTTTGCAG ACTCTGATCA 960
CAGTGAAACG TTCCATGGGA ACTTGGGCCA TGAGAAACAT CCTTCCTAAC CACGTGACTG 1020
CAGAAACATC CTTATCGCGT CCTCCTGGGC AAAGGCCCAA CAGCCTGACT GCAGGGACAT 1080
CCTTGCCATA TCCTGCTGGG CAGCAAGCTC TACCACCCAG ATCCCTCCCT CCCAGTCCCA 1140
TGATTACCCC AGCCTGTGAG TGGCAGTTGG TGCTGGCACT AAGCTGGTTT CCTCCTCCCC 1200
AGGGTTTTGC TGGCAATAAA GATGTTGCTG TTGAAG 1236



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1391 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTNOT04 

(B) CLONE: 1969426 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120 : 
GTACTGCCCA CCACCTCCCT GGGCCACCCC TCACTCAGTG CTCCGGCTCT CTCCTCCTCC 60
TCTTCGTCCT CCTCCACTTC ATCTCCTGTT TTGGGCTCCC CCTCTTACCC TGCTTCTTCC 120
CCTGGGGCCT CCCCCCACCA CCGCCGTGTG CCCCTCAGCC CCCTGAGTTT GCTCGCGGGC 180
CCAGCCGACG CCAGAAGGTC CCAACAGCAG CTGCCCAAAC AGTTTTCGCC AACAATGTCA 240
CCCACCTTGT CTTCCATCAC TCAGGGCGTC CCCCTGGATA CCAGTAAACT GTCCACTGAC 300
CAGCGGTTAC CCCCATACCC ATACAGCTCC CCAAGTCTGG TTCTGCCTAC CCAGCCCCAC 360
ACCCCAAAGT CTCTACAGCA GCCAGGGCTG CCCTCTCAGT CTTGTTCAGT GCAGTCCTCA 420
GGTGGGCAGC CCCCAGGCAG GCAGTCTCAT TATGGGACAC CGTACCCACC TGGGCCCAGT 480
GGGCATGGGC AACAGTCTTA CCACCGGCCA ATGAGTGACT TCAACCTGGG GAATCTGGAG 540
CAGTTCAGCA TGGAGAGCCC ATCAGCCAGC CTGGTGCTGG ATCCCCCTGG CTTTTCTGAA 600
GGGCCTGGAT TTTTAGGGGG TGAGGGGCCA ATGGGTGGCC CCCAGGATCC CCACACCTTC 660
AACCACCAGA ACTTGACCCA CTGTTCCCGC CATGGCTCAG GGCCTAACAT CATCCTCACA 720
GGGGACTCCT CTCCAGGTTT CTCTAAGGAG ATTGCAGCAG CCCTGGCCGG AGTGCCTGGC 780
TTTGAGGTGT CAGCAGCTGG ATTGGAGCTA GGGCTTGGGC TAGAAGATGA GCTGCGCATG 840
GAGCCACTGG GCCTGGAAGG GCTAAACATG CTGAGTGACC CCTGTGCCCT GCTGCCTGAT 900
CCTGCTGTGG AGGAGTCATT CCGCAGTGAC CGGCTCCAAT GAGGGCACCT CATCACCATC 960
CCTCTTCTTG GCCCCATCCC CCACCACCAT TCCTTTCCTC CCTTCCCCCT GGCAGGTAGA 1020
GACTCTACTC TCTGTCCCCA GATCCTCTTT CTAGCATGAA TGAAGGATGC CAAGAATGAG 1080
AAAAAGCAAG GGGTTTGTCC AGGTGGCCCC TGAATTCTGC GCAAGGGATG GGCCTGGGGG 1140
AACTCAAGGG AGGGCCTAAA GCACTTGTAA CTTTGAACCG TCTGTCTGGA GGTCAGAGCC 1200
TGTTGGAAAG CAGGGGTAGA GGGGAGCCCT GGAAGCAGGG CTTTTCCGGA TGCCTAGGGG 1260
TGGGCAGTGC CAGCCCCTCC TCACCACTCT TCCCCTTGCA GTGGAGGAGA GAGCCAGAGT 1320
GGATACTATT TTTTATTAAA TATATTATTA TATGTTAATA AAAAAATCAT ATCAAAAAAA 1380
AAAAAAAAAG G 1391



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2183 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: UCMCL5T01 

(B) CLONE: 1969948

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
CTCTGTGAAC ATATGATGAG AGAAGCCAAG ATCATGCAGT ATAAGTACCT ACTGTTCAGT 60
CTTCACGCCA TAGTGAAGCT TGGAATCCCT CAGAACACTA TTTTGGTGCA GACTTTGCTG 120
AGGGTGACCC AGGAACGTAT CAATGAGTGT GATGAGATAT GCCTTTCAGT TTTGTCAACT 180
GTTTTAGAGG CAATGGAACC ATGCAAGAAT GTTCATGTTC TACGAACGGG ATTCAGAATA 240
CTAGTTGATC AGCAAGTTTG GAAAATAGAA GATGTCTTCA CATTACAAGT TGTGATGAAG 300
TGTATTGGAA AAGATGCACC GATTGCTCTT AAGAGGAAAC TGGAGATGAA AGCCTTGAGG 360
GGATTAGACA GATTTTCTGT TTTGAATAGC CAACACATGT TTGAAGTACT AGCTGCCATG 420
AATCACCGAT CTCTTATACT CCTGGATGAA TGCAGTAAGG TGGTCCTAGA TAATATCCAT 480
GGGTGTCCTT TAAGAATAAT GATCAACATA TTGCAGTCCT GCAAAGACCT CCAGTACCAT 540
AATTTGGATC TCTTCAAGGG ACTTGCAGAT TATGTGGCTG CAACTTTCGA CATCTGGAAG 600
TTCAGAAAAG TTCTTTTTAT CCTCATTTTA TTTGAAAACC TTGGCTTTCG ACCTGTTGGT 660
TTAATGGACC TGTTTATGAA GAGAATAGTA GAGGATCCTG AATCCCTAAA CATGAAAAAC 720
ATTCTATCTA TTCTTCATAC TTACTCTTCT CTCAATCATG TCTACAAATG CCAGAACAAA 780
GAACAGTTCG TGGAAGTTAT GGCTAGTGCT CTGACTGGTT ATCTTCACAC TATTTCTTCT 840
GAAAACTTAT TGGATGCAGT ATATTCATTT TGCTTGATGA ATTACTTTCC CCTGGCTCCT 900
TTTAATCAGC TTCTGCAAAA AGACATCATC AGTGAGCTGC TGACATCAGA TGACATGAAG 960
AATGCTTACA AGCTGCATAC TTTGGATACT TGTCTAAAAC TTGATGATAC TGTCTATCTG 1020
AGGGACATAG CCTTGTCACT CCCACAGCTG CCGCGGGAGC TGCCATCGTC ACATACAAAT 1080
GCAAAGGTGG CAGAGGTGCT GAGCAGCCTT CTGGGAGGTG AAGGACACTT CTCAAAGGAT 1140
GTGCACTTGC CACACAATTA TCATATTGAT TTTGAAATCA GAATGGACAC TAACAGGAAT 1200
CAAGTGCTAC CACTTTCTGA TGTGGATACA ACTTCTGCTA CAGATATTCA AAGAGTAGCT 1260
GTGCTATGTG TTTCCAGATC TGCTTATTGT TTGGGTTCAA GCCACCCCAG AGGATTCCTT 1320
GCTATGAAAA TGCGGCATTT GAATGCAATG GGTTTTCATG TGATCTTGGT CAATAACTGG 1380
GAGATGGACA AACTAGAGAT GGAAGATGCA GTCACATTTT TGAAGACTAA AATCTATTCA 1440
GTAGAAGCTC TTCCTGTTGC TGCTGTAAAT GTGCAAAGCA CACAATAAAG TGAAAATCAA 1500
CCTTTTCATA TTAGGAGACA TGCATTTGTA AAAATTAATA AAGATGACAA GTCAGTTGTC 1560
AATGGAATTG AGCTATCTGC TAAGACAAAA AATGTTACCT CAGTTCACTA TTAAAATTAA 1620
TTTTAGGAGT GGAAGAAATG TTGTTACTGC CATTTAAAAA TATGCTGAGA AAATTCCAGA 1680
AGGGTTATTT TTCCAACCAC ACCTATTCCC TCTAGTGCCC AGATATTTGA TTTGTGAGCT 1740
GTACGTTTCA CCTTTTCATC TTTGATCTAC TAAAAACTGG TTTCTTAGTT GTGAGGTGTC 1800
ACAGGCAGGT TGATGTGGGT AGTAGTCCTT GTCTTTGGAA TCTGAATATT TATACTCCTG 1860
CTCTAAGCTG TTCTAAGACT TGGGGTTATG CCTTTAAATC ATTTTCAAGC ATTGGCCAAA 1920
TAATAATTGG ACAAAGTTCT AAAGTTGTCA AGTGTGTAAG AATTAGTGAG GTAGCTGTTG 1980
AAAATGAGTG AGGATGGTAT TTGTATTTGT AATAAGCACT GCAGGTAGAG ATATTTCATG 2040
GGTTATAATA AGAGAAACAC AGATGAGATG TAGATGGTAA GGAGTCTTAC TGTTGTTGGG 2100
GTCCTTCCTT TCTCTTTCTT TTTTCCCCCT TACCCCTCCC ACAATTTCAT GAAGTCTTTT 2160
AAATTAAATA TATAGCTTNA ATT 2183



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2066 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGAST01 

(B) CLONE: 1988911 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122 : 



AGAACCACTG CAGTGGAGAC TCCATGTGCA AAAGAAAAAA ACCAAATGTG AGGTCATAAA 60
GACTTTCTGC CAGCATGTGG GTGACATTGT TTCTTTGCAG ATTTTGGCTA TGGAAAGGGG 120
AAATGTTCTA AGCAGAGCCC CGTCAAGAGC CCACGGGACA CATTTTGGAG ATGACAGATT 180
TGAAGATCTG GAAGAGGCAA ATCCATTCTC TTTTAGAGAG TTTCTGAAGA CCAAGAACCT 240
CGGCCTCTCG AAAGAGGATC CGGCCAGCAG AATTTATGCA AAGGAAGCCT CGAGGCATTC 300
CCTGGGACTT GACCACAACT CCCCACCCTC CCAAACCGGC GGGTATGGCC TGGAGTATCA 360
GCAGCCATTT TTCGAGGATC CGACAGGGGC TGGTGACCTC CTGGATGAGG AGGAGGATGA 420
GGACACCGGA TGGAGTGGGG CCTACCTGCC GTCCGCCATC GAGCAGACTC ACCCCGAGAG 480
GGTCCCTGCC GGCACGTCGC CCTGCAGCAC ATACCTTTCC TTTTTCTCCA CCCCGTCGGA 540
GCTGGCAGGG CCTGAGTCTC TGCCCTCGTG GGCGTTGAGT GACACTGATT CTCGCGTGTC 600
TCCGGCCTCT CCGGCAGGGA GTCCTAGCGC AGACTTTGCG GTTCATGGAG AGTCTCTGGG 660
AGACAGGCAC CTGCGGACGC TGCAGATAAG TTACGACGCA CTGAAAGATG AAAATTCTAA 720
GCTGAGAAGA AAGCTGAATG AGGTTCAGAG CTTCTCTGAA GCTCAAACAG AAATGGTGAG 780
GACGCTTGAG CGGAAGTTAG AAGCAAAAAT GATCAAGGAG GAAAGCGACT ACCACGACCT 840
GGAGTCGGTG GTTCAGCAGG TGGAGCAGAA CCTGGAGCTG ATGACCAAAC GGGCTGTAAA 900
GGCAGAAAAC CACGTCGTGA AACTAAAACA GGAAATCAGT TTGCTCCAGG CGCAGGTCTC 960
CAACTTCCAG CGAGAGAATG AAGCCCTGCG GTGCGGCCAG GGTGCCAGCC TGACCGTGGT 1020
GAAGCAGAAC GCCGACGTGG CCCTGCAGAA CCTCCGGGTG GTCATGAACA GTGCACAGGC 1080
TTCCATCAAG CAACTGGTTT CCGGAGCTGA GACACTGAAT CTTGTTGCCG AAATCCTTAA 1140
ATCTATAGAC AGAATTTCTG AAGTTAAAGA CGAGGAGGAA GACTCTTGAG GACCCCTGGG 1200
TGTTCTCAGC ATGAAGCTCC GTGTATACCC TGAGGTCACC ACCGCTCGAT CTAAATGTGC 1260
AGTTGTGTCC TTAAATATGC AGTCTTCACC CAGAGTAAAG TGTTGATCGC AAGAGTCCAG 1320
TGTCGTGCCC TCAGCCAGTT CTTGGCCACC ACAATGGGAG CAGCCCTGGC CGAGTTGTCT 1380
CTGTGGTTTC TATGCAGCCC TTCTTGGCGA AATTCCTGCG ATCTTATAGA TTCTAATGAG 1440
CTCTTGGAAG ACATTGTCAT AAAAGCCAGT GATTTTAAGA AAAAGAGTGG TTCTGGAATC 1500
AATGTTTTCC AGTCCCATCC CAGAACATCA GTTGTAAGAT AAGTACAATT GGTTGTCCTT 1560
GATTTCATAA GTAGAACAAA CACTAAATGT GCCTCTGAGA TGGCCACCCC GGGCAGGGAC 1620
CTGTGCCTTC CGCCGATGCT CAGGGCTCCC TCTGGCTCCC GGGTCACTCT TGTGGCCCCA 1680
GTGGGTGGTC CCTGCAGTCA TGGCCTGAGT GCGCAGGGGC CACCGCGTGG CTGCTGCTGT 1740
CCTCCTCCGG GACCCACGGG GACCAAGGTC ACACGTTCCG TGCTGTGAAG CTGTCCAGAT 1800
GTGCCTCTTT GGCTGGGGGT TCTGGTGGAC GTTTCAAGTG GCATTTTGTA CAATGCAGGT 1860
TAGAATTCAG GAATTTCAAG TATGTGCCCG GGTCTGTCAG GTCCCAGTTG CCTTTCTGAC 1920
GGCCCCCCTC AGAGGGACGG CGATGAGCAC TAAATGCTTT TTTGACTATT TTCCTATAGA 1980
TTTTTTTTAA AACTTTTTTT TCCTCCTGTT CCAATTGATA GCTTTCTTAT TTAATAAATT 2040
CTGTAGTTCA CCGCAAAAAA AAAAAA 2066



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1867 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: OVARNOT03 

(B) CLONE: 2061561 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123 : 
TGGCCAGGCT GGTCTAGAAC TCCTGACTGC AAATGATCAG CCCGCCTCAG CCACCCAAAG 60 
TGTTGGGATT ACAGGTGTGA GCCACTGTGC CCAGCGTGAT TTTTTTTTTT TTTAAAGCAA 120
ACTTGTCCTT TGGTTTTGCA GAACAGGCCT GCTCCCTCTC ATCTAGCCCA TCATTTCTTG 180
GGGCCTGAAC CCCAGTGGTC CAAAGTATTG CTTGTGAAAT TTAAAAAATG TGAATATGAT 240
GTGGGGATGG GCCTCTTCTA CATTACCTTG GCCCAGGGGG ATCAGCTGGC TGGGAGGATT 300 
AGTGAGCACC TCTGTATTTT GAGGTCTGAG TCTTCTGGAG CTGTGTAGTT AATCTTCGGT 360 
TTCTGATAAC CCCTGGGTCC ATCTGGCCAT CAGCCTCAGC AGTGAGCAAA GCAATACCAT 420
ACTCATTTCT ATGTTCCTGT TCCTTCCTCT GCTCCTCCTT TGGAGAAGCA ATAATTCATG 480 
GGGGATGATA CAGTAGCACT TTACAAATGG CTCCATGTCA TTCATCCCAG GGGCCATAAT 540
CTCTTGCACC ACCTATTCTT ACTTCCTGTT CAGCTCCTTT ACAGCTTTTA TTTTCAACTG 600 
CTTCCCAACT TGGTGGGGCC TCCTTTAAGG ATGAGCCAAT AGTAAGAATG TGGCTGTAAT 660 
CAGCAGAGAC CCCTCTGAGG GGTATCTGTT CTGCAGCCCC TAGTGAAATC ATGTGATGTG 720
AGACAGAAAC CTAAACATGG TACTTGATTC TAAACCTGTG CCAGTCTATA GCCTCTGCCT 780 
CCCCAAGCAG AGCTCAAGCC AAACGCTTCT GTCCTCTTTC CTTCTGCATT AACCCTTTGC 840 
TGATCCTCAG GGGCCACTCC CCCAACACCC CTGTACTTGG GTGAGGGATG TTGGACAGAG 900 
CCTGTTTTCA TGTACTGCAG GTGGGGGTGT GCTGACATGT TTGCTCTTGG TTGATGGAGA 960 
AGGTACAGAG GCCAGGGAGT GAAAATGGTT GACAGAAGAG GGAAGAGTTA GGTGTCTCAT 1020 
AGTCACTCAT AGTGGGGTGG TCAGGGGTAA TGGCATCTCC CCACTTTAGG CTTCTCAAAC 1080
AGACTTTTGA CACCTCTCAA GTTCAGAGCT CTGATGTGGA AAGACAGGAG GTGTGGGGAA 1140
GGAGGGGGAT TTCGTGTGTT TGCATGAGTG TGCGCTTCAG GCCTTGGGAG TTGGCAAGAG 1200
GGAGGGAAGG AAGGAGAGCA AAATCTTCGG AAGGTGTTTC TTGTACCTGA GGGATCCTGC 1260
CCTGAATCTC CATAGTCTCC ACTGTGAACT GAGGAGGGGA GGGGTGTGCT GGGGAATAAA 1320 
TCTTGTATGA GAACAATCAA AAATCAAACG AATCCCACCG ACAGACTGCT GCTCCTAGTG 1380
ATCTGGACTC ACCTAGGGGG CATCTGGGCT GGGGTTCCAN GCTTACGTNC GCGTGNATGN 1440
GACGNCANAG CTCTTCGAAA GTGTCCCNAA ANTNCAATTC ATTGGCGGTG GTTTTAAAAG 1500 
TTCGGGCCTG GGAAACCCGG GGGNTTACCC ATTTTATCCC NCTTNGANGG CANATTCCCC 1560 
TTTTTCCCCA ATTTGGGGAA ATTTNCCAAA NGGGNCCCGT AACGGTTGGC CTTTTCCCAA 1620
AATTTNGGNC GCCCTTAATT GGGGCGATTG TGGGACCCGC GCCCTTTATA GGGGGGGGCT 1680
TTAAAGCGGC GCNGGGGGTT CTTTGGGTGA TTACCGGCGC GGTTGACCCC GGGTAAAATA 1740
TTGACAAGGG CCCTTTAGCG CGCGGTTCCT TGTGGGGTTT TCCTCCCATT TGCTTTTTCC 1800 
GCAAAAGTTT TGGCGGGGTT TTCCCCGGAA AAGGTCTTAA AAAGCGGTGT GCCCCTCTTT 1860
GAGGGGG 1867

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1628 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PANCNOT04

(B) CLONE: 2084489 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124 : 



CTCTGGGTCT GTAGCAACCG CCCAGCGTTG AGGCGCGGCT CATGCCCCCA GTATCCCGGT 60
CCAGCTATTC CGAGGACATC GTGGGCTCTC GGAGAAGGCG ACGCAGCTCC TCGGGGAGCC 120
CACCATCCCC GCAGAGCAGA TGTTCCTCTT GGGATGGCTG TTCCCGCTCT CACTCCCGCG 180
GCCGTGAGGG CCTCAGGCCT CCTTGGAGTG AGTTGGACGT GGGCGCTCTT TACCCCTTTA 240
GTCGCTCTGG GTCGCGAGGG CGGCTCCCAA GATTCCGCAA CTACGCCTTC GCGTCCTCCT 300
GGTCGACCTC GTATAGTGGA TATCGCTACC ATCGTCACTG CTATGCAGAA GAACGGCAGT 360
CAGCGGAAGA CTACGAGAAG GAAGAGAGCC ATCGGCAGAG GAGGCTGAAG GAGAGAGAGA 420
GGATTGGGGA ATTGGGAGCG CCTGAAGTGT GGGGGCCGTC TCCAAAGTTC CCTCAGCTAG 480
ATTCTGACGA ACATACCCCA GTTGAGGATG AAGAAGAGGT AACGCATCAG AAAAGCAGCA 540
GTTCAGATTC CAACTCGGAA GAACATAGGA AAAAGAAGAC CAGTCGTTCA AGAAACAAGA 600
AAAAAAGAAA GAATAAGTCG TCTAAAAGAA AGCATAGGAA ATATTCTGAT AGTGACAGTA 660
ACTCAGAGTC TGACACAAAT TCTGACTCTG ATGATGATAA AAAGAGAGTT AAAGCCAAGA 720
AGAAAAAGAA GAAAAAGAAA CACAAAACAA AGAAAAAGAA GAATAAGAAA ACCAAAAAAG 780
AATCCAGTGA CTCAAGCTGT AAAGACTCAG AAGAGGACTT GTCAGAAGCT ACCTGGATGG 840
AGCAGCCAAA TGTGGCAGAT ACTATGGATT TAATAGGGCC AGAAGCACCT ATAATACATA 900
CCTCTCAAGA TGAAAAACCT TTGAAGTATG GCCATGCTTT GCTTCCCGGT GAAGGTGCAG 960
CTATGGCTGA GTATGTAAAA GCTGGAAAGC GAATCCCACG AAGAGGTGAA ATTGGGTTGA 1020
CAAGTGAAGA GATCGGTTCT TTTGAATGCT CAGGTTATGT CATGAGTGGT AGCAGGCATC 1080
GCAGAATGGA GGCTGTACGA CTGCGTAAGG AGAACCAGAT CTACAGTGCT GATGAGAAGA 1140
GAGCTCTTGC ATCCTTTAAC CAAGAAGAGA GACGAAAGAG AGAAAGTAAG ATTTTAGCCA 1200
GTTTCCGAGA GATGGTGCAC AAAAAGACAA AAGAGAAAGA TGACAAGTAA GGACTTACTT 1260
GTTGCACAGC AGGAATTTTA ACAACAAAAA TTTTATGTGA CCAAAAGTGT TAAAAGGCTT 1320
TACAGTGCTA CTGTACTTAC CATATTAGTA AGTCCCTCAG GAAAAAGCTT CTTTTGAGAT 1380
ATCTTTAGCA GCTTATTTTT TGTTATTTTA ACTTTAAAAA GTAATATGTG CACATGGTTT 1440
TAAAAATATT CAACCATTAT AGGAGGAGAG TTAGTAAAAA GTGAATCTTT CACTTTAGCC 1500
CCTGACACCT TTCCCCCAAA AATATATATT TTGGTGTCTT ATATACAGAA TATACATTCT 1560
GTGCATATAC AAGAGTATAT GTTGCAGCAT AAAGATTAAA AGCTATTAAA GTTTTTTTTC 1620
GCTCGTTA 1628



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SPLNFET02 

(B) CLONE: 2203226 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125 : 
GTGGCGGCGG CGAAGGATGC ACCCGGCAGG CTTGGCGGCG GCGGCTGCGG GGACGCCCCG 60 
GCTGCCCTCG AAGCGGAGGA TCCCTGTGTC CCAGCCGGGC ATGGCCGACC CCCACCAGCT 120 
TTTCGATGAC ACAAGTTCAG CCCAGAGCCG GGGCTATGGG GCCCAGCGGG CACCTGGTGG 180 
CCTGAGTTAT CCTGCAGCCT CTCCCACGCC CCATGCAGCC TTCCTGGCTG ACCCGGTGTC 240
CAACATGGCC ATGGCCTATG GGAGCAGCCT GGCCGCGCAG GGCAAGGAGC TGGTGGATAA 300 
GAACATCGAC CGCTTCATCC CCATCACCAA GCTCAAGTAT TACTTTGCTG TGGACACCAT 360
GTATGTGGGC AGAAAGCTGG GCCTGCTGTT CTTCCCCTAC CTACACCAGG ACTGGGAAGT 420 
GCAGTACCAA CAGGACACCC CGGTGGCCCC CCGCTTTGAC GTCAATGCCC CGGACCTCTA 480
CATTCCAGCA ATGGCTTTCA TCACCTACGT TTTGGTGGCT GGTCTTGCGC TGGGGACCCA 540
GGATAGGTTC TCCCCAGACC TCCTGGGGCT GCAAGCGAGC TCAGCCCTGG CCTGGCTGAC 600 
CCTGGAGGTG CTGGCCATCC TGCTCAGCCT CTATCTGGTC ACTGTCAACA CCGACCTCAC 660
CACCATCGAC CTGGTGGCCT TCTTGGGCTA CAAATATGTC GGGATGATTG GCGGGGTCCT 720 
CATGGGCCTG CTCTTCGGGA AGATTGGCTA CTACCTGGTG CTGGGCTGGT GCTGCGTGGC 780
CATCTTTGTG TTCATGATCC GGACGCTGCG GCTGAAGATC TTGGCAGACG CAGCAGCTGA 840 
GGGGGTCCCG GTGCGTGGGG CCCGGAACCA GCTGCGCATG TACCTGACCA TGGCGGTGGC 900 
GGCGGCGCAG CCTATGCTCA TGTACTGGCT CACCTTCCAC CTGGTGCGGT GAGCGCGCCC 960 
GCTGAACCTC CCGCTGCTGC TGCTGCTGCT GGGGGCCACT GTGGCCGCCG AACTCATCTC 1020
CTGCCTGCAG GCCCCAAGGT CCACCCTGTC TGGCCACAGG CACCGCCTCC ATCCCATGTC 1080
CCGCCCAGCC CCGCCCCCAA CCCAAGGTGC TGAGAGATCT CCAGCTGCAC AGGCCACCGC 1140
CCCAGGGCGT GGCCGCTGTT ACAGAAACAA TAAACCCTGA TGGGCATGGC AAAAAAAAAA 1200



(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PROSNOT16 

(B) CLONE: 2232884 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
AGAGCCCCAG CCACGCCGGC CCAGGTGGCC TCAGGTGAGG GGGGGCGGAC GCACCTGTGG 60
GGACGGGACG ACGAGTTCAA GCCTCCGTGG GTGCAGTTGG TCGCCAGCGA GGGATGCGGA 120
GACGCCCCTG AACGACCATG GCATCGGCCG ACGAGCTGAC CTTCCATGAA TTCGAGGAGG 180
CCACTAATCT TCTGGCTGAC ACCCCAGATG CAGCCACCAC CAGCAGAAGC GATCAGCTGA 240
CCCCACAAGG GCACGTGGCT GTGGCCGTGG GCTCAGGTGG CAGCTATGGA GCCGAGGATG 300
AGGTGGAGGA GGAGAGTGAC AAGGCCGCGC TCCTGCAGGA GCAGCAGCAG CAGCAGCAGC 360
CGGGATTCTG GACCTTCAGC TACTATCAGA GCTTCTTTGA CGTGGACACC TCACAGGTCC 420
TGGACCGGAT CAAAGGCTCA CTGCTGCCCC GGCCTGGCCA CAACTTTGTG CGGCACCATC 480
TGCGGAATCG GCCGGATCTG TATGGCCCCT TCTGGATCTG TGCCACGTTG GCCTTTGTCC 540
TGGCCGTCAC TGGCAACCTG ACGCTGGTGC TGGCCCAGAG GAGGGACCCC TCCATCCACT 600
ACAGCCCCCA GTTCCACAAG GTGACCGTGG CAGGCATCAG CATCTACTGC TATGCGTGGC 660
TGGTGCCCCT GGCCCTGTGG GGCTTCCTGC GGTGGCGCAA GGGTGTCCAG GAGCGCATGG 720
GGCCCTACAC CTTCCTGGAG ACTGTGTGCA TCTACGGCTA CTCCCTCTTT GTCTTCATCC 780
CCATGGTGGT CCTGTGGCTC ATCCCTGTGC CTTGGCTGCA GTGGCTCTTT GGGGCGCTGG 840
CCCTGGGCCT GTCAGCCGCC GGGCTGGTAT TCACCCTCTG GCCCGTGGTC CGTGAGGACA 900
CCAGGCTGGT GGCCACAGTG CTGCTGTCCG TGGTCGTGCT GCTCCACGCC CTCCTGGCCA 960
TGGGCTGTAA GTTGTACTTC TTCCAGTCGC TGCCTCCGGA GAACGTGGCT CCTCCACCCC 1020
AAATCACATC TCTGCCCTCA AACATCGCGC TGTCCCCTAC CTTGCCGCAG TCCCTGGCCC 1080
CCTCCTAGGA AGG 1093

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1121 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: COLNNOT11

(B) CLONE: 2328134 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
GCGGGGGATG ACGCCACGGA CATGGTGGCC GAGACCGGCG GGGTGGGGGA CGTGTCGCGC 60
GGCCGGGTGG CCTCGGTCGG TACCCTGGGC GCGGACAGCT GCCTCATTAG TATTCGTACC 120
CACGAGGCGG CGCAGCGGGC CCTCGGGGAC AGCGAGCGTC GCGGCTATGG CTTATCACTC 180
GGGCTACGGA GCCCACGGCT CCAAGCACAG GGCCCGGGCA GCCCCGGATC CCCCTCCCCT 240
CTTCGATGAC ACAAGCGGTG GTTATTCCAG CCAGCCCGGG GGATACCCAG CCACAGGAGC 300
AGACGTGGCC TTCAGTGTCA ACCACTTGCT TGGGGACCCA ATGGCCAATG TGGCTATGGC 360
CTATGGCAGC TCCATCGCAT CCCATGGGAA GGACATGGTG CACAAGGAGC TGCACCGTTT 420
TGTGTCTGTG AGCAAACTCA AGTATTTTTT TGCTGTGGAC ACAGCCTACG TGGCCAAGAA 480
GCTAGGGCTG CTGGTCTTCC CCTACACACA CCAGAACTGG GAAGTGCAGT ACAGTCGTGA 540
TGCTCCTCTG CCCCCCCGGC AAGACCTCAA CGCCCCTGAC CTCTATATCC CCACGATGGC 600
CTTCATTACT TACGTGCTCC TGGCTGGGAT GGCACTGGGC ATTCAGAAAA GGTTCTCCCC 660
GGAGGTGCTG GGCCTGTGTG CAAGCACAGC GCTGGTGTGG GTGGTGATGG AGGTGCTGGC 720
CCTGCTCCTG GGCCTCTACC TGGCCACCGT GCGCAGTGAC CTGAGCACCT TTCACCTGCT 780
GGCCTACAGT GGCTACAAAT ACGTGGGAAT GATCCTCAGT GTGCTCACGG GGCTGCTGTT 840
CGGCAGCGAT GGCTACTACG TGGCGCTGGC CTGGACCTCA TCGGCGCTCA TGTACTTCAT 900
TGTGCGCTCT TTGCGGACAG CAGCCCTGGG CCCCGACAGC ATGGGGGGCC CCGTCCCCCG 960
GCAGCGTCTC CAGCTCTACC TGACTCTGGG AGCTGCAGCC TTCCAGCCCC TCATCATATA 1020
CTGGCTGACT TTCCACCTGG TCCGGTGACC CCCTGGCCCC AGATGGCACT GAGTTTTTCA 1080
TTCATTGAAG ATTTGATTTC CTTGAAAAAA AAAAAAAAAG G 1121



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1861 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ISLTNOT01 

(B) CLONE: 2382718 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
CGCGGACTGT GTCTGTTCCC AGGAGTCCTT CGGCGGCTGT TGTGTCAGTG GCCTGATCGC 60
GATGGGGACA AAGGCGCAAG TCGAGAGGAA ACTGTTGTGC CTCTTCATAT TGGCGATCCT 120
GTTGTGCTCC CTGGCATTGG GCAGTGTTAC AGTGCACTCT TCTGAACCTG AAGTCAGAAT 180
TCCTGAGAAT AATCCTGTGA AGTTGTCCTG TGCCTACTCG GGCTTTTCTT CTCCCCGTGT 240
GGAGTGGAAG TTTGACCAAG GAGACACCAC CAGACTCGTT TGCTATAATA ACAAGATCAC 300
AGCTTCCTAT GAGGACCGGG TGACCTTCTT GCCAACTGGT ATCACCTTCA AGTCCGTGAC 360
ACGGGAAGAC ACTGGGACAT ACACTTGTAT GGTCTCTGAG GAAGGCGGCA ACAGCTATGG 420
GGAGGTCAAG GTCAAGCTCA TCGTGCTTGT GCCTCCATCC AAGCCTACAG TTAACATCCC 480
CTCCTCTGCC ACCATTGGGA ACCGGGCAGT GCTGACATGC TCAGAACAAG ATGGTTCCCC 540
ACCTTCTGAA TACACCTGGT TCAAAGATGG GATAGTGATG CCTACGAATC CCAAAAGCAC 600
CCGTGCCTTC AGCAACTCTT CCTATGTCCT GAATCCCACA ACAGGAGAGC TGGTCTTTGA 660
TCCCCTGTCA GCCTCTGATA CTGGAGAATA CAGCTGTGAG GCACGGAATG GGTATGGGAC 720
ACCCATGACT TCAAATGCTG TGCGCATGGA AGCTGTGGAG CGGAATGTGG GGGTCATCGT 780
GGCAGCCGTC CTTGTAACCC TGATTCTCCT GGGAATCTTG GTTTTTGGCA TCTGGTTTGC 840
CTATAGCCGA GGCCACTTTG ACAGAACAAA GAAAGGGACT TCGAGTAAGA AGGTGATTTA 900
CAGCCAGCCT AGTGCCCGAA GTGAAGGAGA ATTCAAACAG ACCTCGTCAT TCCTGGTGTG 960
AGCCTGGTCG GCTCACCGCC TATCATCTGC ATTTGCCTTA CTCAGGTGCT ACCGGACTCT 1020
GGCCCCTGAT GTCTGTAGTT TCACAGGATG CCTTATTTGT CTTCTACACC CCACAGGGCC 1080
CCCTACTTCT TCGGATGTGT TTTTAATAAT GTCAGCTATG TGCCCCATCC TCCTTCATGC 1140
CCTCCCTCCC TTTCCTACCA CTGCTGAGTG GCCTGGAACT TGTTTAAAGT GTTTATTCCC 1200
CATTTCTTTG AGGGATCAGG AAGGAATCCT GGGTATGCCA TTGACTTCCC TTCTAAGTAG 1260
ACAGCAAAAA TGGCGGGGGT CGCAGGAATC TGCACTCAAC TGCCCACCTG GCTGGCAGGG 1320
ATCTTTGAAT AGGTATCTTG AGCTTGGTTC TGGGCTCTTT CCTTGTGTAC TGACGACCAG 1380
GGCCAGCTGT TCTAGAGCGG GAATTAGAGG CTAGAGCGGC TGAAATGGTT GTTTGGTGAT 1440
GACACTGGGG TCCTTCCATC TCTGGGGCCC ACTCTCTTCT GTCTTCCCAT GGGAAGTGCC 1500
ACTGGGATCC CTCTGCCCTG TCCTCCTGAA TACAAGCTGA CTGACATTGA CTGTGTCTGT 1560
GGAAAATGGG AGCTCTTGTT GTGGAGAGCA TAGTAAATTT TCAGAGAACT TGAAGCCAAA 1620
AGGATTTAAA ACCGCTGCTC TAAAGAAAAG AAAACTGGAG GCTGGGCGCA GTGGCTCACG 1680
CCTATAATCC CAGAGGCTGA GGCAGGCGGA TCACCTGAGG TCGGGAGTTC GGGATCAGCC 1740
TGACCAACAT GGAGAAACCC TACTGAGAAT ACAAAGTTAG CCAGGCATGG TGGTGCATGC 1800
CTGTAATCCC AGCTGCTCAG GAGCCTGGCA ACAAGAGCAA AACTCCAGCT CAAAAAAAAA 1860
A 1861



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1975 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ENDANOT01 

(B) CLONE: 2452208 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
GTTTGGAGGA GACTCGGATA TACCTTCTCA GAAGCTGCAC AGGAGGAAAG CAGTGACAAA 60
GAAAGAAGTT GTCATTCTTT GCACGAAACT GGATGGCTTC TACAGGGAGC CAGGCCTCTG 120
ATATAGACGA GATTTTTGGA TTCTTCAACG ATGGCGAACC TCCCACCAAA AAGCCCAGGA 180
AGCTGCTTCC AAGCTTAAAA ACTAAGAAGC CTCGAGAACT TGTGCTAGTG ATTGGAACAG 240
GCATTAGTGC TGCAGTTGCG CCCCAAGTTC CAGCCCTCAA ATCCTGGAAG GGGTTAATTC 300
AGGCCTTACT GGATGCTGCC ATTGATTTTG ATCTTTTAGA AGATGAGGAG AGCAAAAAGT 360
TTCAGAAATG TCTCCATGAA GACAAGAACC TGGTCCATGT TGCCCATGAC CTTATCCAGA 420
AACTCTCTCC TCGTACCAGT AATGTTCGAT CCACATTTTT CAAGGACTGT TTATATGAAG 480
TATTTGATGA CTTGGAGTCA AAGATGGAAG ATTCTGGAAA ACAGCTACTT CAGTCAGTTC 540
TCCACCTGAT GGAAAATGGA GCCCTCGTAT TAACTACAAA TTTTGATAAT CTCTTGGAAC 600
TGTATGCAGC AGATCAGGGG AAACAGCTTG AATCCCTTGA CCTTACTGAT GAGAAAAAGG 660
TCCTCGAGTG GGCTCAGGAG AAGCGTAAGC TGAGCGTGTT GCATATTCAC GGAGTCTACA 720
CCAACCCTAG TGGCATTGTC CTTCATCCGG CTGGATATCA GAACGTGCTC AGGAACACTG 780
AAGTCATGAG AGAAATTCAG AAACTCTACG AAAACAAGTC ATTTCTTTTC CTGGGCTGTG 840
GCTGGACTGT GGATGACACC ACTTTCCAGG CCCTTTTCTT GGAGGCTGTC AAGCATAAAT 900
CTGACCTAGA ACATTTCATG CTGGTTCGGA GAGGAGACGT AGATGAGTTC AAAAAGCTTC 960
GAGAAAACAT GCTGGACAAG GGGATTAAAG TCATCTCCTA TGGAGATGAC TATGCCGATC 1020
TTCCAGAATA TTTCAAGCGA CTGACATGTG AGATCTCCAC AAGGGGTACA TCAGCAGGGA 1080
TGGTGAGAGA AGGTCAGCTA AATGGCTCAT CTGCAGCACA CAGTGAAATA AGAGGCTGTA 1140
GTACATGAGC GAGCTAGAGA AATCACCACC GTTTAGACCA AGCTGTAAGG CCCTACTACA 1200
GACAGTGTTT AACAAGTAAA CTTACAAGAA CCCAACACAA TTCCCAGAAA GTAACAATAG 1260
CCAGAGGTTG AAGGGCGGGG TAGAAGAGGG GGGAATGTTG CAGCGTAATC CTTCATACCA 1320
CCTGGTTCTT GATATTCTGC CGCCTGTTCA AGTTCAAGAA TAAAAGCGAC AGCAGGACCC 1380
AAATGCAGCT CCCAACCCAC TCCCCAGGCT AGACATGCTT GTGTCCACAC AGCACACCAA 1440
TGTGATACTT CCACTGACCG GCTGCAGCTC TGCATGAAGG ACTCGGGGTC TGGATGCCAT 1500
GGAATCACTG TGGCTCTTGT TGCAGTTTTG TACTCTATAC TTGGTTTTTC AATTAAGCTT 1560
AATGGCTTTT TTAAAACATG ACTTGAAGCT CTAGTTTTCT AGATCTTTTA CAGTGTACAG 1620
TATTTTACAT AACTAAGCTG TATTAAAAGC TTGTTCATTT ACTTGCCAGG ACCCTGGCTC 1680
TACTTTTAGA GTCATTGTAA GAAACTCTAA CTTGCATCAA GGTACTAATA AGCTTAATTT 1740
TAATAACCCA AAGTTTAAAG GTTCCGATCT TTCTCCTTGG GGTGGAGTGA TCTCATTCTC 1800
AGGACAACCG TTTACTTACC TGATTCCTCG GAGCATTATC AACTTCTGCT CTGTTGTCCT 1860
GACCATACAT ATGTCCTAGA ACTACAGTTA AGTGTGTTGT GGAATTTTAG TTTTGAATCC 1920
GGAATAAATG AAGTCCCAGG ACTCAAAGAA GAGAGAAAAA AAAAAAGGGG GCCCC 1975

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: ENDANOT01 

(B) CLONE: 2457825 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
TCTACTGTCC CCTGCCCTGT ACCCCCAGGC ATTGATCTGG AGAACATTGT GTACTACAAG 60
GACGACACCC ACTACTTTGT GATGACAGCC AAGAAGCAGT GCCTGCTGCG GCTGGGGGTG 120
CTGCGCCAGG ACTGGCCAGA CACCAATCGG CTGCTGGGCA GTGCCAATGT GGTGCCCGAG 180
GCTCTGCAGC GCTTTACCCG GGCAGCTGCT GACTTTGCCA CCCATGGCAA GCTCGGGAAA 240
CTAGAGTTTG CCCAGGATGC CCATGGGCAG CCTGATGTCT CTGCCTTTGA CTTCACGAGC 300
ATGATGCGGG CAGAGAGTTC TGCTCGTGTG CAAGAGAAGC ATGGCGCCCG CCTGCTGCTG 360
GGACTGGTGG GGGACTGCCT GGTGGAGCCC TTCTGGCCCC TGGGCACTGG AGTGGCACGG 420
GGCTTCCTGG CAGCCTTTGA TGCAGCCTGG ATGGTGAAGC GGTGGGCAGA GGGCGCTGAG 480
TCCCTAGAGG TGTTGGCTGA GCGTGAGAGC CTGTACCAGC TTCTGTCACA GACATCCCCA 540
GAAAACATGC ATCGCAATGT GGCCCAGTAT GGGCTGGACC CAGCCACCCG CTACCCCAAC 600
CTGAACCTCC GGGCAGTGAC CCCCAATCAG GTACGAGACC TGTATGATGT GCTAGCCAAG 660
GAGCCTGTGC AGAGGGACAA CGACAAGACA GATACAGGGA TGCCAGCCAC CGGGTCGGCA 720
GGCACCCAGG AGGAGCTGCT ACGCTGGTGC CAGGAGCAGA CAGCTGGGTA CCCGGGAGTC 780
CACGTCTCCG ATTTGTCTTC CTCCTGGGCT GATGGGCTAG CTCTGTGTGC CCTGGTGTAC 840
CGGCTGCAGC CTGGCCTGCT GGAACCCTCA GAGCTGCAGG GGCTGGGAGC TCTGGAAGCA 900
ACTGCTTGGG CACTAAAGGT GGCAGAGAAT GAGCTGGGCA TCACACCGGT GGTGTCTGCA 960
CAGGCCGTGG TAGCAGGGAG TGACCCACTG GGCCTCATTG CCTACCTCAG CCACTTCCAC 1020
AGTGCCTTCA AGAGCATGGC CCACAGCCCA GGCCCTGTCA GCCAGGCCTC CCCAGGGACC 1080
TCCAGTGCTG TATTATTCCT TAGTAAACTT CAGAGGACCC TGCAGCGATC CCGGGCCAAG 1140
GAAAATGCAG AGGATGCTGG TGGCAAGAAG CTGCGCTTGG AGATGGAGGC CGAGACCCCA 1200
AGTACTGAGG TGCCACCTGA CCCAGAGCCT GGTGTACCCC TGACACCCCC ATCCCAACAC 1260
CAGGAGGCCG GTGCTGGGGA CCTGTGTGCA CTTTGTGGGG AACACCTCTA TGTCCTGGAA 1320
CGCCTCTGTG TCAACGGCCA TTTCTTCCAC CGGAGCTGCT TCCGCTGCCA TACCTGTGAG 1380
GCCACACTGT GGCCAGGTGG CTACGAGCAG CACCCAGGCA GTAGAACGTC TCAGTTCTTC 1440
TTCTCAGCTC TTGTGGCCAT GGAGAAGGAG GAAAAAGAGA GTCCCTTCTC CAGTGAAGAG 1500
GAAGAAGAAG ATGTGCCTTT GGACTCAGAT GTGGAACAGG CCCTGCAGAC CTTTGCCAAG 1560
ACCTCAGGCA CCATGAATAA CTACCCAACA TGGCGTCGGA CTCTGCTGCG CCGTGCGAAG 1620
GAGGAGGAGA TGAAGAGGTT CTGCAAGGCC CAGACCATCC AACGGCGACT AAATGAGATT 1680
GAGGCTGCCT TGAGGGAGCT AGAGGCCGAG GGCGTGAAGC TGGAGCTGGC CTTGAGGCGC 1740
CAGAGCAGTT CCCCAGAACA GCAAAAGAAA CTATGGGTAG GACAGCTGCT ACAGCTCGTT 1800
GACAAGAAAA ACAGCCTGGT GGCTGAGGAG GCCGAGCTCA TGATCACGGT GCAGGAATTG 1860
AATCTGGAGG AGAAACAGTG GCAGCTGGAC CAGGAGCTAC GAGGCTACAT GAACCGGGAA 1920
GAAAACCTAA AGACAGCTGC TGATCGGCAG GCTGAGGACC AGGTCCTGAG GAAGCTGGTG 1980
GATTTGGTCA ACCAGAGAGA TGCCCTCATC CGCTTCCAGG AGGAGCGCAG GCTCAGCGAG 2040
CTGGCCTTGG GGACAGGGGC CCAGGGCTAG ACGAGGGTGG GCCGTCTGCT TTCGTTCCCA 2100
CAAAGAAAGC ACCTCACCCC AGCACAGTGC CACCCCTGTT CATCTGGGCT GCCTGGCAGA 2160



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 546 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP1NOT03 

(B) CLONE: 2470740 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:
GAGGAAGAAG AGGAAGAGGG GGCTCCGATT GGGACCCCTA GGGATCCTGG AGATGGTTGT 60
CCTTCCCCCG ACATCCCTCC TGAACCCCCT CCAACACACC TGAGGCCCTG CCCTGCCAGC 120
CAGCTCCCTG GACTCCTGTC CCATGGCCTC CTGGCCGGCC TCTCCTTTGC AGTGGGGTCC 180
TCCTCTGGCC TCCTGCCCCT CCTGCTGCTG CTGCTGCTTC CATTGCTGGC AGCCCAGGGT 240
GGGGGTGGCC TGCAGGCAGC GCTGCTGGCC CTTGAGGTGG GGCTGGTGGG TCTGGGGGCC 300
TCCTACCTGC TCCTTTGTAC AGCCCTGCAC CTGCCCTCCA GTCTTTTCCT ACTCCTGGCC 360
CAGGGTACCG CACTGGGGGC CGTCCTGGGN CATGAGCTGG CGCCGAAGGC TCATGGGTGT 420
TCCCCTGGGG CTTTGGAACT GCCTGGTTCT TAAGCTTNGG CAAGGCCTAG CTCCAACCTC 480
TGGTGGCTAA TGGCANCCGG GGGGGAANAT GGGTTCNGGA AAAAGGGCCC CCGGGTTTCA 540
CCGGGG 546



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 581 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SMCANOT01 

(B) CLONE: 2479092 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
GCCATGGAGG CCCTGAGGAG GGCCCACGAG GTCGCGCTCC GCCTGCTGCT GTGTAGGCCG 60
TGGGCCTCGC GCGCCGCCGC CCGCCCCAAG CCCAGCGCCT CGGAGGTGCT GACGCGGCAT 120
CTGCTGCAGC GGCGCCTGCC GCACTGGACC TCCTTCTGCG TGCCCTACAG CGCCGTCCGC 180
AACGACCAGT TCGGCCTCTC GCACTTCAAC TGGCCGGTGC AGGGCGCCAA CTACCACGTC 240
CTGCGCACCG GCTGCTTCCC CTTCATCAAG TACCACTGCT CCAAGGCTCC CTGGCAGGAC 300
CTGGCCCGGC AGAACCGCTT CTTCACGGCG CTCAAGGTCG TCAACCTCGG TATTCCAACT 360
TTATTATATG GACTTGGCTC CTGGTTATTT GCCAGAGTCA CAGAGACTGT GCATACCAGT 420
TATGGACCCA TAACAGTTTA TTTTCTCAAT AAAGAAGATG AAGGTGCCAT GTATTGAAAG 480
TGTGCGTCAA AGAACATAAA TATCAGTGGA TTTTCTCTGT GTATATGTGC AGTATTTATT 540
TTTGATCCTT TAAAATAAAA CTTTTGCAAA TAAAAAAAAA A 581



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1259 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SMCANOT01 

(B) CLONE: 2480544 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
GGGCTGGGCC CCGCCGCAGC TCCAGCTGGC CGGCTTGGTC CTGCGGTCCC TTCTCTGGGA 60
GGCCCGACCC CGGCCGCGCC CAGCCCCCAC CATGCCACCC GCGGGGCTCC GCCGGGCCGC 120
GCCGCTCACC GCAATCGCTC TGTTGGTGCT GGGGGCTCCC CTGGTGCTGG CCGGCGAGGA 180
CTGCCTGTGG TACCTGGACC GGAATGGCTC CTGGCATCCG GGGTTTAACT GCGAGTTCTT 240
CACCTTCTGC TGCGGGACCT GCTACCATCG GTACTGCTGC AGGGACCTGA CCTTGCTTAT 300
CACCGAGAGG CAGCAGAAGC ACTGCCTGGC CTTCAGCCCC AAGACCATAG CAGGCATCGC 360
CTCAGCTGTG ATCCTCTTTG TTGCTGTGGT TGCCACCACC ATCTGCTGCT TCCTCTGTTC 420
CTGTTGCTAC CTGTACCGCC GGCGCCAGCA GCTCCAGAGC CCATTTGAAG GCCAGGAGAT 480
TCCAATGACA GGCATCCCAG TGCAGCCAGT ATACCCATAC CCCCAGGACC CCAAAGCTGG 540
CCCTGCACCC CCACAGCCTG GCTTCATGTA CCCACCTAGT GGTCCTGCTC CCCAATATCC 600
ACTCTACCCA GCTGGGCCCC CAGTCTACAA CCCTGCAGCT CCTCCTCCCT ATATGCCACC 660
ACAGCCCTCT TACCCGGGAG CCTGAGGAAC CAGCCATGTC TCTGCTGCCC CTTCAGTGAT 720
GCCAACCTTG GGAGATGCCC TCATCCTGTA CCTGCATCTG GTCCTGGGGG TGGCAGGAGT 780
CCTCCAGCCA CCAGGCCCCA GACCAAGCCA AGCCCTGGGC CCTACTGGGG ACAGAGCCCC 840
AGGGAAGTGG AACAGGAGCT GAACTAGAAC TATGAGGGGT TGGGGGGAGG GCTTGGAATT 900
ATGGGCTATT TTTACTGGGG GCAAGGGAGG GAGATGACAG CCTGGGTCAC AGTGCCTGTT 960
TTCAAATAGT CCCTCTGCTC CCAAGATCCC AGCCAGGAAG GCTGGGGCCC TACTGTTTGT 1020
CCCCTCTGGG CTGGGGTGGG GGGAGGGAGG AGGTTCCGTC AGCAGCTGGC AGTAGCCCTC 1080
CTCTCTGGCT GCCCCACTGG CCACATCTCT GGCCTGCTAG ATTAAAGCTG TAAAGACATA 1140
ACTCATATCA GTCGCATCAT TGGACCCATC CACACCTTCC AGGAACACCG NCTTCAGCTG 1200
GGCCCAGACT GTTGCCCACT CCATATTCCA AAAGTAGGGG AGGGCCAGCA CCAGCATCG 1259



(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2033 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRAITUT21 

(B) CLONE: 2518547 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
CGGCTCGAGG CCGCAGCCCC ATGGACAGTC TTCTGCACCC CCGGGAGCGC CCTGGATCCA 60
CTGCCTCCGA GAGCTCAGCC TC^CTGGGCA GTGAGTGGGA CCTCTCAGAA TCTTCTCTCA 120
GCAACCTGAG TCTTCGCCGT TCCTCAGAGC GCCTCAGTGA CACCCCTGGA TCCTTCCAGT 180
CACCTTCCCT GGAAATTCTG CTGTCCAGCT GCTCCCTGTG CCGTGCCTGT GATTCGCTGG 240
TGTATGATGA GGAAATCATG GCTGGCTGGG CACCTGATGA CTCTAACCTC AACACAACCT 300
GCCCCTTCTG CGCCTGCCCC TTTGTGCCCC TGCTCAGTGT CCAGACCCTT GATTCCCGGC 360
CCAGTGTCCC CAGCCCCAAA TCTGCTGGTG CCAGTGGCAG CAAAGATGCT CCTGTCCCTG 420
GTGGTCCTGG CCCTGTGCTC AGTGACCGAA GGCTCTGCCT TGCTCTGGAT GAGCCCAGCT 480
CTGCAACGGG CACATGGGGG GAGCCTCCCG GCGGGTTGAG AGTGGGGCAT GGGCATACCT 540
GAGCCCCCTG GTGCTGCGTA AGGAGCTGGA GTCGCTGGTA GAGAACGAGG GCAGTGAGGT 600
GCTGGCGTTG CCTGAACTGC CCTCTGCCCA CCCCATCATC TTCTGGAACC TTTTGTGGTA 660
TTTCCAACGG CTACGCCTGC CCAGTATTCT ACCAGGCCTG GTGCTGGCCT CCTGTGATGG 720
GCCTTCGCAC TCCCAGGCCC CATCTCCTTG GCTAACCCCT GATCCAGCCT CTGTTCAGGT 780
ACGGCTGCTG TGGGATGTAC TGACCCCTGA CCCCAATAGC TGCCCACCTC TCTATGTGCT 840
CTGGAGGGTC CACAGCCAGA TCCCCCAGCG GGTGGTATGG CCAGGCCCTG TACCTGCATC 900
CCTTAGTTTG GCACTGTTGG AGTCAGTGCT GCGCCATGTT GGACTCAATG AAGTGCACAA 960
GGCTGTGGGG CTCCTGCTGG AAACTCTAGG GCCCCCACCC ACTGGCCTGC ACCTGCAGAG 1020
GGGAATCTAC CGTGAGATAT TATTCCTGAC AATGGCTGCT CTGGGCAAGG ACCACGTGGA 1080
CATAGTGGCC TTCGATAAGA AGTACAAGTC TGCCTTTAAC AAGCTGGCCA GCAGCATGGG 1140
CAAGGAGGAG CTGAGGCACC GGCGGGCGCA GATGCCCACT CCCAAGGCCA TTGACTGCCG 1200
AAAATGTTTT GGAGCACCTC CAGAATGCTA GAGACCTTAA GCTTCCCTCT CCAGCCTAGG 1260
GTGGGGAAGT GAGGAAGAAG GGATTCTAGA GTTAAACTGC CTCCCTGTTG CCTTCATGGA 1320
GTTGGGAACA GGCTGGGAAG GATGCCCAGT CAAAGGCTCC AAGCGAGGAC AACAGGAAGA 1380
GGGATCCACT GTTACCAAAA GTCCTGATTC CCCCATCACC AACCTACCCA GTTTGTTCGT 1440
GCTGATGTTG GGGGAGATCT GGGGGGAGTT GGTACAGCTC TGTTCTTCCC TTGTCCTATA 1500
CCGGGAACTC CCCTCCAGGG TACCCACAGA TCTGCATTGC CCTGGTCATT TTAGAAGTTT 1560
TTGTTTTAAA AAACAACTGG AAAGATGCAG AGCTACTGAG CCTTTGCCCT GAATGGGAGG 1620
TAGGGATGTC ATTCTCCACC AATAATGGTC CCTCTTCCCT GACGTTGCTG AAGGAGCCCA 1680
AGGCTCTCCA TGCCTTTCTA CCTAAGTGTT TGTATTTTAT TTTAAATTAT TTATTCTGGA 1740
GCCACAGCCC CCTTGCTTAT GAGGTTCTTA TGGAGAGTGA GAAAGGGAAG GGAAATAGGG 1800
CACCATGGTC CGGTGGTTTG TAGTTCCTTC AAAGTCAGGC ACTGGGAGCT AGAGGAGTCT 1860
CAAGCTCCCC TTAGGAAGAA CTGGTGCCCC CTCCAGTCCT AATTTTTCTT GCCTGCCCCG 1920
CCTTGGGGAA TGCCTCACCC ACCCAGGTCC TGACCTGTGC AATAAGGATT GTTCCCTGCG 1980
AAGTTTTGTT GGATGTAAAT ATAGTAAAAG CTGCTTCTGT CTTTTTCAAA AAA 2033



(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GBLANOT02 

(B) CLONE: 2530650 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
GCCCACTGGG CTCTCCCGGC TGCAGTGCCA GGGCGCAGGA CGCGGCCGAT CTCCCGCTCC 60
CGCCACCTCC GCCACCATGC TGCTCCCCCA GCTCTGCTGG CTGCCGCTGC TCGCTGGGCT 120
GCTCCCGCCG GTGCCCGCTC AGAAGTTCTC GGCGCTCACG TTTTTGAGAG TGGATCAAGA 180
TAAAGACAAG GATTGTAGCT TGGACTGTGC GGGTTCGCCC CAGAAACCTC TCTGCGCATC 240
TGACGGAAGG ACCTTCCTTT CCCGTTGTGA ATTTCAACGT GCCAAGTGCA AAGATCCCCA 300
GCTAGAGATT GCATATCGAG GAAACTGCAA AGACGTGTCC AGGTGTGTGG CCGAAAGGAA 360
GTATACCCAG GAGCAAGCCC GGAAGGAGTT TCAGCAAGTG TTCATTCCTG AGTGCAATGA 420
CGACGGCACC TACAGTCAGG TCCAGTGTCA CAGCTACACG GGATACTGCT GGTGCGTCAC 480
GCCCAACGGG AGGCCCATCA GCGGCACTGC CGTGGCCCAC AAGACGCCCC GGTGCCCGGG 540
TTCCGTAAAT GAAAAGTTAG CCCAACGCGA AGGCACAGGA AAAACAGATG ATGCCGCAGC 600
TCCAGCGTTG GAGACTCAGC CTCAAGGAGA TGAAGAAGAT ATTGCATCAC GTTACCCTAC 660
CCTTTGGACT GAACAGGTTA AAAGTCGGCA GAACAAAACC AATAAGAATT CAGTGTCATC 720
CTGTGACCAA GAGCACCAGT CTGCCCTGGA GGAAGCCAAG CAGCCCAAGA ACGACAATGT 780
GGTGATCCCT GAGTGTGCGC ACGGCGGCCT CTACAAGCCA GTGCAGTGCC ACCCCTCCAC 840
GGGGTACTGC TGGTGCGTCC TGGTGGACAC GGGGCGCCCC ATTCCCGGCA CATCCACAAG 900
GTACGAGCAG CCGAAATGTG ACAACACGGG CCAGGGCCCA CCCAGCCAAA GCCCGGGACC 960
TGTACAAGGG CCGCCAGCTA CAAGGTTGTC CGGGTGCCAA AAAGCATGAG TTTCTGACCA 1020
GCGTTCTGGA CGCGCTGTCC ACGGACATGG TCCACGCCGC CTCCGACCCC TCCTCCTCGT 1080
CAGGCAGGCT CTCAGAACCC GACCCCAGCC ATACCCTAGA GGAGCGGGTG GTGCACTGGT 1140
ACTTCAAACT ACTGGATAAA AACTCCAGTG GAGACATCGG CAAAAAGGAA ATCAAACCCT 1200
TCAAGAGGTT CCTTCGCAAA AAATCAAAGC CCAAAAAATG TGTGAAGAAG TTTGTTGAAT 1260
ACTGTGACGT GAATAATGAC AAATCCATCT CCGTACAAGA ACTGATGGGC TGCCTGGGCG 1320
TGGCGAAAGA GGACGGCAAA GCGGACACCA AGAAACGCCA CACCCCCAGA GGTCATGCTG 1380
AAAGTACGTC TAATAGACAG CCAAGGAAAC AAGGATAAAT GGCTCATACC CCGAAGGCAG 1440
TTCCTAGACA CATGGGAAAT TTCCCTCACC AAAGAGCAAT TAAGAAAACA AAAACAGAAA 1500
CACATAGTAT TTGCACTTTG TACTTTAAAT GTAAATTCAC TTTGTAGAAA TGAGCTATTT 1560
AAACAGACTG TTTTAATCTG TGAAAATGGA GAGCTGGCTT CAGAAAATTA ATCACATACA 1620
ATGTATGTGT CCTCTTTTGA CCTTGGAAAT CTGTATGTGG TGGAGAAGTA TTTGAATGCA 1680
TTTAGGCTTA ATTTCTTCGC CTTCCACATG TTAACAGTAG AGCTCTATGC ACTCCGGCTG 1740
CAATCGTATG GCTTTCTCTA ACCCCTGCAG TCACTTCCAG ATGCCTGTGC TTACAGCATT 1800
GTGGAATCAT GTTGGAAGCT CCACATGTCC ATGGAAGTTT GTGATGTACG GCCGACCCTA 1860
CAGGCAGTTA ACATGCATGG GCTGGTTTGT TTCTTGGGAT TTTCTGTTAG TTTGTCTTGT 1920
TTTGCTTTCC AGAGATCTTG CTCATACAAT GAATCACGCA ACCACTAAAG CTATCCAGTT 1980
AAGTGCAGGT AGTTCCCCTG GAGGAAATAA TATTTTCAAA CTGTCGTTGG TGTGATACTT 2040
TGGCTCAAAG GATCTTTGCT TTTCCATTTT AAGCTTCTGT TTTGAGTTTT GCCCTGGGGC 2100
TTGAATGAGT CCCAGAGAGT CGTTCGGATG GTGGGAGGCT GCCTAGGAGG CAGTAAATCC 2160
AGTCACAGTG CCTGGGAGGG GCCCATCCTT CCAAAATGTA AATCCAGTCG CGGTGTGACC 2220
GAGCTGGCTA ACAGGCTTGT CTGCCTGGTT TTCCTCCTAC ACGTGGACAT TATTCTCCTG 2280
ATCCTCCTAC CTGGTCCACC CCAGGGCTAC CGGAAGGTAA AATCTTCACC TGAACCAATT 2340
ATGAGCAGTC TCCTTACTGA AGGTACAGCC GGATACGTGG TGCCCCCGGG GCTGGTGTTG 2400
GCAGCCGGGG GGAGGTGCCT GAGGGTCCCC ACGGTTCCTT TCTGCTTTTC TGAATGCATC 2460
AAGGGTACGA GAACTTGCCA ATGGGAAATT CATCCGAGTG GCACTGGCAG AGAAGGATAG 2520
GAGTGGAATG CCCACACAGT GACCAACAGA ACTGGTCTGC GTGCATAACC AGCTGCCACC 2580
CTCAGGCCTG GGCCCCAGAG CTCAGGGCAC CCAGTGTCTT AAGGAACCAT TTGGAGGACA 2640
GTCTGAGAGC AGGAACTTCA AGCTGTGATT CTATCTCGGC TCAGACTTTT GGTTGGAAAA 2700
AGATCTTCAT GGCCCCAAAT CCCCTGAGAC ATGCCTTGTA GAATGATTTT GTGATGTTGT 2760
GATGCTTGTG GAGCATCGCG TAAGGCTTCT TGCTTATTTA AACTGTGCAA GGTAAAAATC 2820
AAGCCTTTGG AGCCACAGAA CCAGCTCAAG TACATGCCAA TGTTGTTTAA GAAACAGTTA 2880
TGATCCTAAA CTTTTTGGAT AATCTTTTAT ATTTCTGACC TTTGAATTTA ATCATTGTTC 2940
TTAGATTAAA ATAAAATATG CTATTGAAAC TAAAAAAAAA AAAGAGGGGA GAAGAAAAAA 3000
AAAAAGG 3007



(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1229 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THYMNOT04

(B) CLONE: 2652271 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:
CTCTCTGCTC CGGTGCAGGC CCGCAGGCGC CCTGGGCTGG GAGCAACGCG ACTGACCGTG 60
GTCGTGGGCG GACGGCGGCT GCAGCGTGGA GGAGCTGGGG TCGCTGTGGG TCGCGAACAG 120
AGCCCGGGAC GTGCGCGCTT GGTGCACGAT CCTGAAGGGG AGCTCCGAGG GGCCCGGGTC 180
TCCAGGGCTG CTGCGGCCAT TCCCGGAGCC CGGCGCGGGG CCCGCGAGAT ACTGGTTTAG 240
GCCGTCCCAG GGCTCCGGGC GCACCCGGTG GCCGCTGCTG CAGCGGAGGG AGCGCGGCGG 300
CGCGGGGGCT CGGAGACAGC GTTTCTCCCG GAAGTCTTCC TCGGGCAGCA GGTGGGAAGT 360
GGGAGCCGGA GCGGCAGCTG GCAGCGTTCT CTCCGCAGGT CGGCACCATG CGCCCTGCAG 420
CCCTGCGCGG GGCCCTGCTG GGCTGCCTCT GCCTGGCGTT GCTTTGCCTG GGCGGTGCGG 480
ACAAGCGCCT GCGTGACAAC CATGAGTGGA AAAAACTAAT TATGGTTCAG CACTGGCCTG 540
AGACAGTATG CGAGAAAATT CAAAACGACT GTAGAGACCC TCCGGATTAC TGGACAATAC 600
ATGGACTATG GCCCGATAAA AGTGAAGGAT GTAATAGATC GTGGCCCTTC AATTTAGAAG 660
AGATTAAGGA TCTTTTGCCA GAAATGAGGG CATACTGGCC TGACGTAATT CACTCGTTTC 720
CCAATCGCAG CCGCTTCTGG AAGCATGAGT GGGAAAAGCA TGGGACCTGC GCCGCCCAGG 780
TGGATGCGCT CAACTCCCAG AAGAAGTACT TTGGCAGAAG CCTGGAACTC TACAGGGAGC 840
TGGACCTCAA CAGTGTGCTT CTAAAATTGG GGATAAAACC ATCCATCAAT TACTACCAAG 900
TTGCAGATTT TAAAGATGCC CTTGCCAGAG TATATGGAGT GATACCCAAA ATCCAGTGCC 960
TTCCACCAAG CCAGGATGAG GAAGTACAGA CAATTGGTCA GATAGAACTG TGCCTCACTA 1020
AGCAAGACCA GCAGCTGCAA AACTGCACCG AGCCGGGGGA GCAGCCGTCC CCCAAGCAGG 1080
AAGTCTGGCT GGCAAATGGG GCCGCCGAGA GCCGGGGTCT GAGAGTCTGT GAAGATGGCC 1140
CAGTCTTCTA TCCCCCACCT AAAAAGACCA AGCATTGATG CCCAAGTTTT GGAAATATTC 1200
TGTTTTAAAA AGCATGAGGT AGGCATGTC 1229



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1972 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT11

(B) CLONE: 2746976 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:
ACAGGGGCTT CCCCTTCGCC GCCGCCGCCG CCGCCGGCCA AGCTCCGCCG CGCCCGCGGC 60
CCGCGGCCGC CATGCAGTTT ATGTTGCTTT TTAGTCGTCA GGGAAAGCTT CGACTGCAAA 120
AATGGTATGT CCCACTATCA GACAAAGAGA AGAGAAAGAT CACAAGAGAA CTTGTTCAGA 180
CCGTTTTAGC ACGGAAACCT AAAATGTGCA GCTTCCTTGA GTGGCGAGAT CTGAAGATTG 240
TTTACAAAAG ATATGCTAGT CTGTATTTTT GCTGTGCTAT TGAGGATCAG GACAATGAAC 300
TAATTACCCT GGAAATAATT CATCGTTATG TGGAATTACT TGACAAGTAT TTCGGCAGTG 360
TCTGTGAACT AGATATCATC TTTAATTTTG AGAAGGCTTA TTTTATTTTG GATGAGTTTC 420
TTTTGGGAGG GGAAGTTCAG GAAACATCCA AGAAAAATGT CCTTAAAGCA ATTGAGCAGG 480
CTGATCTACT GCAGGAGGAT GCGAAAGAAG CTGAAACCCC ACGTAGTGTT CTTGAAGAAA 540
TTGGACTGAC ATAACTCTCC TCCCTTGTTG ATGACTTCTT GTGGCATTTC ACACACTGTA 600
GATGGTCACT CCCTTCATGT CCATGTTAGC TCATGGTGTA AGATGATGTC TTGTCAGTAT 660
TACTGTTTTG CTAAGCCGCT TCATTCATGC CTACACAATT TTTTTTTAAA AGGGAACTTT 720
AGTTAATTAA GTGATAAGGG ACTTAAATAT GAATTAGAAT GGTGCAGAAA GAGATACCTT 780
TTCTGGATAT TTTAAAGTTT AAAGGTCAGT TTCTCTTAAT CTGATTATGT GCACATATGA 840
AAATGGCACA TCATATACAT GTAAAATCAG GCAGTATACA TTTATTAATT ACTGTATTTG 900
ACAAAGGAAA CTCTTAAATT ATAATGTGAA ACCTGGTTTT ATGAAACCAA AGACTAGTGC 960
AGCATTTCAG CATATGTAAA AAAAAAAAAA AAGGGAATTG ACATGTCACA TATCAAATGA 1020
ATGGAAACTT TGTTGAAACT TTAAAAAGCA AATTTACTCC AAAGACTTGT ATTGGAAATT 1080
ACATACCTTT TTTTTTTTTT TTTAAAGGAC TACAGATTAT TTTTAATGAC TAAATTGGAG 1140
TGATACTTCT TACACTAAAA ATTATTTCTT AGGCATTCTG AATCTGGGAT GAGAAACAGG 1200
ATTGTTTCAC AATAGTAAGC ACATAATTTT TAAGGCCAAG GCACATTTGA CTCCTGAGAT 1260
GAATTTTTTG TGGTCATAAT CAAATACTTA GTTGTTTTTG ATGCCCCAAA ATAAAGTGAG 1320
AATGGTAATT TGCCAGGAAT TCTTCATAAC AGTATCTTAC AAAAAACGTG TTGCTCTCTT 1380
CACAGTATTA TGTGTAAAGT CATTGTTTAA AGCACGAATG TTCCCTCTGG GGTACTTGTT 1440
AAAGCTAAAT TTATTTTGCT TCCCTCCACT TAGAAGTGCT GCACACTTTA CAGCAGCTTC 1500
CTTTCTTTCC ATGGCACTGC CTAGTTAACA GAAGTCTTAT AAAAATTTAA AAAGACACAT 1560
TTCTTACAAA AAAGAGTTGA ATGAGGTAAA ATGGCATTAG ATGGCTCTAT ATTTTTTAAA 1620
GCTATGTAAT TGTTCAGCGT CACTTTTCTA AGTACTTATA CATATCTAAA CATGTCTTCA 1680
TGGTTTATAT TTTCACTTAT ATATGCTGGG CTGGATTAAG CTTTGTTGTG ATTGTGACCA 1740
ACATTCAGGC CACGTGAGCA CTGTCTTATC ACATCGCCAA TTAGTTGTAA TAAACGTTCA 1800
ACGTACAAAC ACTGGAGTGT GTTTTTATCT CTTTCCAAAA GTTTGTCAAA CTATGCAGAG 1860
CTGCTGAAGG AAGAATTTCT CATTTTTTTT TCAGTAAAAT GTTGAAAATT CCCCTCCATT 1920
TGAATATGGT GGTTGTTATA AGCACACACA AGATACATGG TGGAAGATCT AG 1972



(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1741 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP1AZS08 

(B) CLONE: 2753496 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
CGGGTTCCGG GCTCCGGGCT CTGGGTGGCG GCGGCTGTGA GCNGCGGCTG ANCCNCCGCG 60
CTGCGCANCG ACGCGGGAAT GAAGCGGGCG CTGGGCAGGC GAAAGGGCGT GTGGTTGCGC 120
CTGAGGAAGA TACTTTTCTG TGTTTTGGGG TTGTACATTG CCATTCCATT TCTCATCAAA 180
CTATGTCCTG GAATACAGGC CAAACTGATT TTCTTGAATT TCGTAAGAGT TCCCTATTTC 240
ATTGATTTGA AAAAACCACA GGATCAAGGT TTGAATCACA CGTGTAACTA CTACCTGCAG 300
CCAGAGGAAG ACGTGACCAT TGGAGTCTGG CACACCGTCC CTGCAGTCTG GTGGAAGAAC 360
GCCCAAGGCA AAGACCAGAT GTGGTATGAG GATGCCTTGG CTTCCAGCCA CCCTATCATT 420
CTGTACCTGC ATGGGAACGC AGGTACCAGA GGAGGCGACC ACCGCGTGGA GCTTTACAAG 480
GTGCTGAGTT CCCTTGGTTA CCATGTGGTC ACCTTTGACT ACAGAGGTTG GGGTGACTCA 540
GTGGGAACGC CATCTGAGCG GGGCATGACC TATGACGCAC TCCACGTTTT TGACTGGATC 600
AAAGCAAGAA GTGGTGACAA CCCCGTGTAC ATCTGGGGCC ACTCTCTGGG CACTGGCGTG 660
GCGACAAATC TGGTGCGGCG CCTCTGTGAG CGAGAGACGC CTCCAGATGC CCTTATATTG 720
GAATCTCCAT TCACTAATAT CCGTGAAGAA GCTAAGAGCC ATCCATTTTC AGTGATATAT 780
CGATACTTCC CTGGGTTTGA CTGGTTCTTC CTTGATCCTA TTACAAGTAG TGGAATTAAA 840
TTTGCAAATG ATGAAAACGT GAAGCACATC TCCTGTCCCC TGCTCATCCT GCACGCTGAG 900
GACGACCCGG TGGTGCCCTT CCAGCTTGGC AGAAAGCTCT ATAGCATCGC CGCACCAGCT 960
CGAAGCTTCC GAGATTTCAA AGTTCAGTTT GTGCCCTTTC ATTCAGACCT TGGCTACAGG 1020
CACAAATACA TTTACAAGAG CCCTGAGCTG CCACGGATAC TGAGGGAATT CCTGGGGAAG 1080
TCGGAGCCTG AGCACCAGCA CTGAGCCTGG CCGTGGGAAG GAAGCATGAA GACCTCTGCC 1140
CTCCTCCCGT TTTCCTCCAG TCAGCAGCCC GGTATCCTGA AGCCCCGGGG GGCCGGCACC 1200
TGCAATGCTC AGGAGCCCAG CTCGCACCTG GAGAGCACCT CAGATCCCAG GTGGGGAGGC 1260
CCCTGCAGGC CTGCAGTGCC CGGAGGCCTG AGCATGGCTG TGTGGAAAGC GTGGGTGGCA 1320
GGCATGTGGC TCTCCTTGCC GCCCCTCAAC CTGAGATCTT GTTGGGAGAC TTAATGGCAG 1380
CAGGCAGCCA TCACTGCCTG GTTGATGCTG CACTGAGCTG GACAGGGGGA GTCCGGGCAG 1440
GGGACTCTTG GGGCTCGGGA CCATGCTGAG CTTTTTGGCA CCACCCACAG AGAACGTGGG 1500
GTCCAGGTTC TTTCTGCACC TTCCCAGCAC ATGCAGAATG ACTCCAGTGG TTCCATCGTC 1560
CCCTCCTGCC CTGTGTACCT GCTTGCCTTT CTCAGCTGCC CCACCTCCCC TGGGCTGGCC 1620
CACTCACCCA CAGTGGAAGT GCCCGGGATC TGCACTTCCT CCCCTTTCAC CTACCTGTAC 1680
ACCTAACCTG GCCTTAGACT GAGCTTTATT TAAGAATAAA ATCGTGGTGG TGAAAAAAAA 1740
A 1741



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2808 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: OVARTUT03 

(B) CLONE: 2781553 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
GGCAAGATGG CGGAAGGGGA GGACGTGGGA TGGTGGCGGA GCTGGCTGCA GCAGAGCTAC 60
CAAGCAGTCA AAGAGAAGTC CTCTGAAGCC TTGGAGTTTA TGAAGCGGGA CCTGACGGAG 120
TTTACCCAGG TGGTGCAGCA TGACACGGCC TGTACCATCG CAGCCACGGC CAGCGTGGTC 180
AAGGAGAAGC TGGCTACGGA AGGCTCCTCA GGAGCAACAG AGAAGATGAA GAAAGGGTTA 240
TCTGACTTCC TAGGGGTGAT CTCAGACACC TTTGCCCCTT CGCCAGACAA AACCATCGAC 300
TGCGATGTCA TCACCCTGAT GGGCACACCG TCTGGCACAG CTGAGCCCTA TGATGGCACC 360
AAGGCTCGCC TCTATAGCCT GCAGTCGGAC CCAGCAACCT ACTGTAATGA ACCAGATGGG 420
CCCCCGGAAT TGTTTGACGC CTGGCTTTCC CAGTTCTGCT TGGAGGAGAA GAAGGGGGAG 480
ATCTCAGAGC TCCTTGTAGG CAGCCCCTCC ATCCGGGCCC TCTACACCAA GATGGTTCCA 540
GCAGCTGTTT CCCATTCAGA ATTCTGGCAT CGGTATTTCT ATAAAGTCCA TCAGTTAGAG 600
CAGGAGCAGG CCCGGAGGGA CGCCCTGAAG CAGCGGGCGG AACAGAGCAT CTCTGAAGAG 660
CCCGGCTGGG AGGAGGAGGA AGAGGAGCTC ATGGGCATTT CACCCATATC TCCAAAAGAG 720
GCAAAGGTTC CTGTGGCCAA AATTTCTACA TTCCCTGAAG GAGAACCTGG CCCCCAGAGC 780
CCCTGTGAAG AGAATCTGGT GACTTCAGTT GAGCCCCCAG CAGAGGTGAC TCCATCAGAG 840
AGCAGTGAGA GCATCTCCCT CGTGACACAG ATCGCCAACC CGGCCACTGC ACCTGAGGCA 900
CGAGTGCTAC CCAAGGACCT GTCCCAAAAG CTGCTAGAGG CATCCTTGGA GGAACAGGGC 960
CTGGCTGTGG ATGTGGGTGA GACTGGACCC TCACCCCCTA TTCACTCCAA GCCCCTAACG 1020
CCTGCTGGCC ACACCGGCGG CCCAGAGCCC AGGCCTCCAG CCAGAGTAGA GACTCTGAGG 1080
GAGGAGGCGC CCACAGACTT ACGGGTGTTT GAGCTGAACT CGGATAGTGG GAAGTCTACA 1140
CCCTCCAACA ATGGAAAGAA AGGCTCAAGC ACGGACATCA GTGAGGACTG GGAGAAAGAC 1200
TTTGACTTGG ACATGACTGA AGAGGAGGTG CAGATGGCAC TTTCCAAAGT GGATGCCTCC 1260
GGGGAGCTGG AAGATGTAGA GTGGGAGGAC TGGGAGTGAG GGAGCCAGAG GGAGCAGCTC 1320
CCCCACCCAT GGCATCTCTC GCCTCCCTCG CTCGTCTCAG CCCAGCCCTG GAAGACTGAG 1380
AATGTTCCCC CAAATCTCCT CTGCCAACCA GAGCTCTGGG CACAGATTCT GGTGGCTCCC 1440
TGCTGGCCCT CTTGGGCCTC TGCTCACACC TGGGAAGGGG CTCTCTAAAT CCCGGCCAGA 1500
AACTCTGACT TGTGCCAACA ATAGGATGAC CCAAGGGAGA GGAAACCTAT CCTCCTCACC 1560
AGAAGAGCCT GTGTTTTTCT GCTGAACACC CACTGTTCCT GAGGACTCCT GCTGGGAAGT 1620
CCCAAGGGAT AGTTCTAGCC CTTCTGCCTG TGTAGACAGA AGCTAAACCA CCAGTCTCTC 1680
TCGGAGGAAG CTGAGACAAC ATACTCTGTC CATACATAAG CAGGCAGGGA GGGCCATGCC 1740
ACCTACCCTT GGCTAAACAG GGACAGTGAA CACATTTTGG TTCCTATCCC AGTGGGTAAG 1800
AGGCACTTAT CTCTGGGAAA TTTGCCTCTC TTGGGACTCT CCCCCTCCCA GGCATTTTCC 1860
ATTCCTGGAA AGGCTCCTTT GGGGTTCAGA ATCCAGAGAC CAAACCCTGA CCCACCTCCT 1920
TCCTTTCCTC CAGCCCACGC TGGTCTGTCC CCATGCCTTC CCAGGGCTTC TTCATGTCAG 1980
ATGCACCCAA GTCCTTAGCC CAGCTGTGCC ACCTGCAGGA GTTCGCTCTT GCGTTTCTTC 2040
CCCTCCCCAA GAAGGGAGGG GGCTACTTCA GGCCCTTCTG TGTGTTGCCT GGCAGGATAC 2100
CTTGTCCAAC CAGCTACCCA CCTCAACTCC CCTGTAGTTT AGGACACAAA ACAGCTACCA 2160
GCGGTACAGA GCGGTGATCA AAGCCGAGTA CTTACAACTC TGGTAAGCCT AGCTTCTCCG 2220
CCTCAGCCCT TCTGCTTCTG GAAGGGCTAT CCTGGGGGTG AACTTGAAAC TCTCATCAGG 2280
CTTCTGCAAA AGCTCTTCTT CCTGAAGACA GACCCAGCCT TTGTGCTCTC ACCCTCCACT 2340
CTGGTAAAGC TGCACCTCTG GGGGAATGAG GGGCTGCAGG AATCTCTGGA GAGCCTGGTG 2400
CTTCACGATG CTGCTCTGGT GATTCTTGTA CCTAATCTGG TGTGCTCACC AATGAGTGAA 2460
AGGGATCGTG GGTCAGGGAC ACCGAGAGAG TGAGGTCACT TCCACTTCAA ACCTTCAGTG 2520
AGGGGGTGGG ATGGAGAGAA TGCTGAATCT TTTTTTTGAC GGGATGGGGT TTTTCTCTTT 2580


GTAATTATTT 


CTTTAGTTTA 


ATTAACCTTT 


TGGTTGTTTG 


TGCAAT AT T A 


TATATTTTAA 


2640 


ATTATAATGC 


ATCTCCCCAG 


AGTATTTTGT 


AGCTGGGAAA 


AGAAAAAAGG 


AAAAAAAGAA 


2700 


AAAAAGATTC 


TAACAGCTGT 


TAGTTTTATA 


ATTAAAAAAG 


AAAGAAAAAA 


GAACTTTGTC 


2760 


CTGAACCTTT 


TACAGACTTG 


CCGTTAACAG 


CATTAAAGTG 


ATTCACCC 




2808 
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(2) INFORMATION FOR SEQ ID NO: 14 0: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: ADRETUT06

(B) CLONE: 2821925 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140 : 
CATGCGCCGA CCTTCCTCGG CTGGATTTAC ANGTTNNCCC TTAACACCCG GGATTTAAGG 60 
GACCCACACT ACCTTCCCGA AGTTGAAGGC AAGCGGTGAT TGTTTGTAGA CGGCGCTTTG 120
TCATGGGACC TGTGCGGTTG GGAATATTGC TTTTCCTTTT TTTGGCCGTG CACGAGGCTT 180
GGGCTGGGAT GTTGAAGGAG GAGGACGATG ACACAGAACG CTTGCCCAGC AAATGCGAAG 240
TGTGTAAGCT GCTGAGCACA GAGCTACAGG CGGAACTGAG TCGCACCGGT CGATCTCGAG 300 
AGGTGCTGGA GCTGGGGCAG GTGCTGGATA CAGGCAAGAG GAAGAGACAC GTGCCTTACA 360 
GCGTTTCAGA GACAAGGCTG GAAGAGGCCT TAGAGAATTT ATGTGAGCGG ATCCTGGACT 420
ATAGTGTTCA CGCTGAGCGC AAGGGCTCAC TGAGATATGC CAAGGGTCAG AGTCAGACCA 480
TGGCAACACT GAAAGGCCTA GTGCAGAAGG GGGTGAAGGT GGATCTGGGG ATCCCTCTGG 540
AGCTTTTGGG ATGAGCCCAG CCGTTGAGGT CACATACCTC AAGAAGCAGT GTGAGACCAT 600
GTTNGAGGAG TTTTGAGACA TTGTGGGAGA CTGGTACTTG CACCATCAGG AGCAGCCGCT 660 
ACAAGATTTT CTCTGTGAAG GTCATGTGCT GCCAGCTGCT TGAACTGCAT GTCGGGT 717 



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2552 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: UTRSTUT05 

(B) CLONE: 2879068 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141 : 
GGCAGGGGGC GCGCCGGGCC CAGCGCCACG TCACCGCCCA GCAGCCCTCC CGATTGGCGG 60 
GCGGGGCGGC TATAAAGGGA GGGCGCAGGC GGCGCCCGGA TCTCTTCCGC CGCCATTTTA 120
AATCCAGCTC CATACAACGC TCCGCCGCCG CTGCTGCCGC GACCCGGACT GCGCGCCAGC 180
ACCCCCCTGC CGACAGCTCC GTCACTATGG AGGATATGAA CGAGTACAGC AATATAGAGG 240
AATTCGCAGA GGGATCCAAG ATCAACGCGA GCAAGAATCA GCAGGATGAC GGTAAAATGT 300 
TTATTGGAGG CTTGAGCTGG GATACAAGCA AAAAAGATCT GACAGAGTAC TTGTCTCGAT 360 
TTGGGGAAGT TGTAGACTGC ACAATTAAAA CAGATCCAGT CACTGGGAGA TCAAGAGGAT 420 
TTGGATTTGT GCTTTTCAAA GATGCTGCTA GTGTTGATAA GGTTTTGGAA CTGAAAGAAC 480
ACAAACTGGA TGGCAAATTG ATAGATCCCA AAAGGGCCAA AGCTTTAAAA GGGAAAGAAC 540
CTCCCAAAAA GGTTTTTGTG GGTGGATTGA GCCCGGATAC TTCTGAAGAA CAAATTAAAG 600
AATATTTTGG AGCCTTTGGA GAGATTGAAA ATATTGAACT TCCCATGGAT ACAAAAACAA 660
ATGAAAGAAG AGGATTTTGT TTTATCACAT ATACTGATGA AGAGCCAGTA AAAAAATTGT 720
TAGAAAGCAG ATACCATCAA ATTGGTTCTG GGAAGTGTGA AATCAAAGTT GCACAACCCA 780
AAGAGGTATA TAGGCAGCAA CAGCAACAAC AAAAAGGTGG AAGAGGTGCT GCAGCTGGTG 840
GACGAGGTGG TACGAGGGGT CGTGGCCGAG GTCAGGGCCA AAACTGGAAC CAAGGATTTA 900 
ATAACTATTA TGATCAAGGA TATGGAAATT ACAATAGTGC CTATGGTGGT GATCAAAACT 960 
ATAGTGGCTA TGGCGGATAT GATTATACTG GGTATAACTA TGGGAACTAT GGATATGGAC 1020 
AGGGATATGC AGACTACAGT GGCCAACAGA GCACTTATGG CAAGGCATCT CGAGGGGGTG 1080 
GCAATCACCA AAACAATTAC CAGCCATACT AAAGGAGAAC ATTGGAGAAA ACAGGTGTGT 1140
ATAAGAGTAC AGGAAAACAG TAGAAATGTC TAATTTAATT TAAAGATCAA TAGACAAATG 1200
AAACGTAAAA ACAAAATACT ATGTAGCCTG TTTTTACTAA ATTGTTGATT TTTTAATTGC 1260
TTTATGAGCC TGTTTTGCCT AAAGTGTCTA TAGATCTTTA ACTTTAAAGT CTTATCTCAC 1320 
TTTCTTTAGT ATTGCAGAAA AACTTAAGAG TTTTTCTGTT TGCTTTTGTG TACCAGGTGG 1380 
TCTAGAGGAA TAATTAAACA TTTTAGAACT ATTAACAGGT AAAGTACTGA AATGGGTACA 1440
ACTTAAGGAA AACAAGAATG TTGTCTTCTA ACTCTGACAT TATACCTTGT TTGTACCCGC 1500 
CAGCGGGAAC TTCATTGCAG GCCGTGTGTC ACCCTGACCA CGTCTATCTC TGGGGGTCGC 1560 
ACGTTGCGGG CAGAGCGCAA GGCATACACC AGAAAACGCT GTCCTGTGGT ATGGTCTCTT 1620
CCAACTTCAT GTACCAGCGT AAAGATTAAA GTGGAAAACT TCAGACTTTG GCTTCATTTT 1680
TAATCTTTTT GGAGATTAAG TGTCTAAACT TAACTTAAAT GGTTTTTTAC AGGAGTTAAA 1740
GTACATAAAT GCCTTTTTAC AGCTTAATCA TTTTGGTCTT CTGTTTAGTG TTGTATTTCA 1800
ATTGTGGAGC CTCATTTTAA GTGTTCATTC TTTTAAGATT TAATGCTTGC TTTTTCTTTT 1860
TATAGCTAAT AGTGAAATCT ACAAACCAAA ACAAGAACTT TTAAATCTGG GATATAAATT 1920 
AAAGATCATA TGCACAGATC AATTTATGTT CTTGTAATAA ACTTATTAGA AATTGGTGTT 1980
TGTGATAGCA TTTTACTTGG GTTACTAGAG ATGCTTCTAG TAGACCTTAA TCTAGCATAG 2040
TTGAACCTCT GAATATGGGA AGGTTGTATT CCCAGATTCT TTCCTGAATA GATTTGAATT 2100
TAATGTCATT TGGGAACTCC AGGGTGAGTT TATTGACTAC CCAAACTGTA TTTTACCAAT 2160
AAATATGCAT ATGATCTTTA ATTATTGAAG AAAATAAAGT GAGGACTTAA AACAATTCAT 2220
GAAAGTGGAC CTTTAAAAGC TTGTCAGAGT TGCACAAATC TAACTGGTAT TTTGTTTTTG 2280
TTTTTAGGAG GAGATGTTAA AGTAACCCAT CTTGCAGGAC GACATTGAAG ATTGGTCTTC 2340
TGTTGATCTA AGATGATTAT TTTGTAAAAG ACTTTCTAGT GTACAAGACA CCATTGTGTC 2400
CAACTGTATA TAGCTGCCAA TTAGTTTTCT TTGTTTTTAC TTTGTCCTTT GCTATCTGTG 2460
TTATGACTCA ATGTGGATTT GTTTATACAC ATTTTATTTG TATCATTTCA TGTTAAACCT 2520
CAAATAAATG CTTCCTTATG TGAAAAAAAA AA 2552




(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1046 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SINJNOT02 

(B) CLONE: 2886757 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142 :
TACCAGTGTA AAGCCAGAGC TGAGGTTCTT GATAGTCCAC AATGGGTGAA CCACAGCAAG 60
TGAGTGCACT TCCACCACCT CCAATGCAAT ATATCAAGGA ATATACGGAT GAAAATATTC 120
AAGAAGGCTT AGCTCCCAAG CCTCCCCCTC CAATAAAAGA CAGTTACATG ATGTTTGGCA 180
ATCAGTTCCA ATGTGATGAT CTTATCATCC GCCCTTTGGA AAGTCAGGGC ATCGAACGGC 240
TTCATCCTAT GCAGTTTGAT CACAAGAAAG AACTGAGAAA ACTTAATATG TCTATCCTTA 300
TTAATTTCTT GGACCTTTTA GATATTTTAA TAAGGAGCCC TGGGAGTATA AAACGAGAAG 360
AGAAACTAGA AGATCTTAAG CTGCTTTTTG TACACGTGCA TCATCTTATA AATGAATACC 420
GACCCCACCA AGCAAGAGAG ACCTTGAGAG TCATGATGGA GGTCCAGAAA CGTCAACGGC 480
TTGAAACAGC TGAGAGATTT CAAAAGCACC TGGAACGAGT AATTGAAATG ATTCAGAATT 540
GCTTGGCTTC TTTGCCTGAT GATTTGCCTC ATTCAGAAGC AGGAATGAGA GTAAAAACTG 600
AACCAATGGA TGCTGATGAT AGCAACAATT GTACTGGACA GAATGAACAT CAAAGAGAAA 660
ATTCAGGTCA TAGGAGAGAT CAGATTATAG AGAAAGATGC TGCCTTGTGT GTCCTAATTG 720
ATGAGATGAA TGAAAGACCA TGAAAGATGT TTCTTTTTCT TTTTTTCCTT TTGATAATAG 780
CATCATATAT TAGTTCATTT TCTTTTGGAC AGTCTTAAGA GAAGTTTCAC TAAAAATGTA 840
AACAGCTTTA ATCTTGACTC CAAATTTTTC AATTATGAGA TGTCATAGGC AGTAATTTCG 900
CTGTATAACA AGCATAGACA AATGAGTGTC CCTGCACTAA GAAGAATCAC TTTAAAAAGC 960
AAAGTGTTAG CTGCTGTTGT ATGGGACATT CCTATGTTTT AGAGTTGCAG TAAAACTTTG 1020
ATGATAACCT CAAAAAAAAA TAAAAA 1046



(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1864 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SCORNOT04 

(B) CLONE: 2964329 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143 : 
GCCCTGGGCT CGCGGCGGTG CCGCGGGGAT GGCGGGAGCC GGAGCTGGAG CCGGAGCTCG 60 
CGGCGGAGCG GCGGCGGGGG TCGAGGCTCG AGCTCGCGAT CCACCGCCCG CGCACCGCGC 120 
ACATCCTCGC CACCCTCGGC CTGCGGCTCA GCCCTCGGCC CGCAGGATGG ATGGCGGGTC 180
AGGGGGCCTG GGGTCTGGGG ACAACGCCCC GACCACTGAG GCTCTTTTCG TGGCACTGGG 240
CGCGGGCGTG ACGGCGCTCA GCCATCCCCT GCTCTACGTG AAGCTGCTCA TCCAGGTGGG 300 
TCATGAGCCG ATGCCCCCCA CCCTTGGGAC CAATGTGCTG GGGAGGAAGG TCCTCTATCT 360 
GCCGAGCTTC TTCACCTACG CCAAGTACAT CGTGCAAGTG GATGGTAAGA TAGGGCTGTT 420 
CCGAGGCCTG AGTCCCCGGC TGATGTCCAA CGCCCTCTCT ACTGTGACTC GGGGTAGCAT 480
GAAGAAGGTT TTCCCTCCAG ATGAGATTGA GCAGGTTTCC AACAAGGATG ATATGAAGAC 540
TTCCCTGAAG AAAGTTGTGA AGGAGACCTC CTACGAGATG ATGATGCAGT GTGTGTCCCG 600 
CATGTTGGCC CACCCCCTGC ATGTCATCTC AATGCGCTGC ATGGTCCAGT TTGTGGGACG 660 
GGAGGCCAAG TACAGTGGTG TGCTGAGCTC CATTGGGAAG ATTTTCAAAG AGGAAGGGCT 720
GCTGGGATTC TTCGTTGGAT TAATCCCTCA CCTCCTGGGC GATGTGGTTT TCTTGTGGGG 780
CTGTAACCTG CTGGCCCACT TCATCAATGC CTACCTGGTG GATGACAGCT TCAGCCAGGC 840
CCTGGCCATC CGGAGCTATA CCAAGTTCGT GATGGGGATT GCAGTGAGCA TGCTGACCTA 900 
CCCCTTCCTG CTAGTTGGCG ACCTCATGGC TGTGAACAAC TGCGGGCTGC AAGCTGGGCT 960
CCCCCCTTAC TCCCCAGTGT TCAAATCCTG GATTCACTGC TGGAAGTACC TGAGTGTGCA 1020
GGGCCAGCTC TTCCGAGGCT CCAGCCTGCT TTTCCGCCGG GTGTCATCAG GATCGTGCTT 1080
TGCCCTGGAG TAACCTGAAT CATCTAAAAA ACACGGTCTC AACCTGGCCA CCGTGGGTGA 1140
GGCCTGACCA CCTTGGGACA CCTGCGAGAC GACTCCAACC CAACAACAAC CAGATGTGCT 1200 
CCAGCCCAGC CGGGCTTCAG TTCCATATTT GCCATGTGTC TGTCCAGATG TGGGGTTGAG 1260 
CGGGGGTGGG GCTGCACCCA GTGGATTGGG TCACCCGGCA GACCTAGGGA AGGTGAGGCG 1320
AGGTGGGGAG TTGGCAGAAT CCCCATACCT CGCAGATTTG CTGAGTCTGT CTTGTGCAGA 1380
GGGCCAGAGA ATGGCTTATG GGGGCCCAGG TTGGATGGGG AAAGGCTAAT GGGGTCAGAC 1440
CCCACCCCGT CTACCCCTCC AGTCAGCCCA GCGCCCATCC TGCAGCTCAG CTGGGAGCAT 1500 
CATTCTCCTG CTTTGTACAT AGGGTGTGGT CCCCTGGCAC GTGGCCACCA TCATGTCTAG 1560 
GCCTATGCTA GGAGGCAAAT GGCCAGGCTC TGCCTGTGTT TTTCTCAACA CTACTTTTCT 1620 
GATATGAGGG CAGCACCTGC CTCTGAATGG GAAATCATGC AACTACTCAG AATGTGTCCT 1680
CCTCATCTAA TGCTCATCTG TTTAATGGTG ATGCCTCGCG TACAGGATCT GGTTACCTGT 1740
GCAGTTGTGA ATACCCAGAG GTTGGGCAGA TCAGTGTCTC TAGTCCTACC CAGTTTTAAA 1800
GTTCATGGTA AGATTTGACC TCATCTCCCG CAAATAAATG TATTGGTGAT TTGGAAAAAA 1860



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2295 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: SCORNOT04 

(B) CLONE: 2965248 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144 : 
GTCTGCAGCT CCGGCCGCCA CTTGCGCCTC TCCAGCCTCC GCAGGCCCAA CCGCCGCCAG 60 
CACCATGGCC AGCACCATTT CCGCCTACAA GGAGAAGATG AAGGAGCTGT CGGTGCTGTC 120 
GCTCATCTGC TCCTGCTTCT ACACACAGCC GCACCCCAAT ACCGTCTACC AGTACGGGGA 180 
CATGGAGGTG AAGCAGCTGG ACAAGCGGGC CTCAGGCCAG AGCTTCGAGG TCATCCTCAA 240
GTCCCCTTCT GACCTGTCCC CAGAGAGCCC TATGCTCTCC TCCCCACCCA AGAAGAAGGA 300 
CACCTCCCTG GAGGAGCTGC AAAAGCGGCT GGAGGCAGCC GAGGAGCGGA GGAAGACGCA 360
GGAGGCGCAG GTGCTGAAGC AGCTGGCGGA CGGCGCGAGC ACGAGCGCGA GGTGCTGCAC 420
AAGGCGCTGG AGGAGAATAA CAACTTCAGC CGCCAGGCGG AGGAGAAGCT CAACTACAAG 480
ATGGAGCTCA GCAAGGAGAT CCGCGAGGCA CACCTGGCCG CACTGCGCGA GCGGCTGCGC 540
GAGAAGGAGC TGCACGCGGC CGAGGTGCGC AGGAACAAGG AGCAGCGAGA AGAGATGTCG 600 
GGCTAAGGGC CCGGGACGGG CGGCGCCCAT CCTGCGACGG AACACGTTCG GGTTTTGGTT 660 
TTGTTTCGTT CACCTCTGTC TAGATGCAAC TTTTGTTCCT CCTCCCCCAC CCCAGCCCCC 720
AGCTTCATGC TTCTCTTCCG CACTCAGCCG CCCTGCCCTG TCCTCGTGGT GAGTCGCTGA 780
CCACGGCTTC CCCTGCAGGA GCCGCCGGGC GTGAGACGCG GTCCCTCGGT GCAGACACCA 840
GGCCGGGCGC GGCTGGGTCC CCCGGGGGCC CTGTGAGAGA GGTGGTGGTG ACCGTGGTAA 900 
ACCCAGGGCG GTGGCGTGGG ATCGCGGGTC CTTACGCTGG GCTGTCTGGT CAGCACGTGC 960 
AGGTCAGGGC AGGTCCTCTG AGCCGGCGCC CCTGGCCAGC AGGCGAGGCT ACAGTACCTG 1020 
CTGTCTTTCC AGGGGGAAGG GGCTCCCCAT GAGGGAGGGG CGACGGGGGA GGGGGGTGAT 1080 
GGTGCCTGGG AGCCTGCGTG TGCAGCCGGT GCTTGTTGAA CTGGCAGGCG GGTGGGTGGG 1140 
GGCTGCAGCT TTCCTTAATG TGGTTGCACA GGGGTCCTCT GAGACCACCT GGCGTGAGGT 1200 
GGACACCCTG GGCCTTCCTG GAAGCCTGCA GTTGGGGGCC TGCCCTGAGT CTGCTGGGGA 1260
GTGGGCATTC TCTGCCAGGG ACCCATGAGC AGGCTGCATG GTCTAGAGGT TGTGGGCAGC 1320
ATGGACAGTC CCCCACTCAG AAGTGCAAGA GTTCCAAAGA GCCTCTGGCC CAGGCCCCTC 1380
CGTGGGACAG CCCCGCCGCC CCTCCCCACC AGGGCTTTGC AGATGTCCTT GAAAGACCCA 1440
CCCTAGAGCC CTTTGGAGTG CTGGCCCCTC CTGTGCCCTC TGCCCTGGTG GAAGCGGCAG 1500
CCACAAGTCC TCCTCAGGGA GCCCCAAGGG GGATTTTGTG GGACCGCTGC CCACAGATCC 1560
AGGTGTTGGA AGGGCAGCGG GTAAGGTTCC CAAGCCAGCC CCAACACCCT TCCCACTTGG 1620 
CACCCAGAGG GGGCTGTGGG TGGAGGCCTG ACTCCAGGCC TCTCCTGCCC ACACCCTCTG 1680 
GGCTGAGTTC CTTCTTTCCC TTGGACGCCC AGTGCTGGCC TTGGAGGACG GTCAGCTGGA 1740
GGATGGCGGT GGGGGAGGCT GTCTTTGTAC CACTGCAGCA TCCCCCACTT CTCCACGGAA 1800
GCCCCATCCC AAAGCTGCTG CCTGGCCCCT TGCTGTAAAG TGTGAAGGGG GCGGCTGAGT 1860
TCTCTTAGGA CCCAGAGCCA GGGCCCTCAA CTTCCATCCT GCGGGAGGCC TTGGCCGGGC 1920
ACTGCCAGTG TCTTCCAGAG CCACACCCAG GGACCACGGG AGGATCCTGA CCCCTGCAGG 1980
GCTCAGGGGT CAGCAGGGAC CCACTGCCCC ATCTCCCTCT CCCCACCAAG ACAGCCCCAG 2040
AAGGAGCAGC CAGCTGGGAT GGGAACCCAA GGCTGTCCAC ATCTGGCTTT TGTGGGACTC 2100 
AGAAAGGGAA GCAGAACTGA GGGCTGGGAT ATTCCTCATG GTGGCAGCGC TCATAGCGAA 2160 
AGCCTACTGT AATATGCACC CATCTCATCC ACGTAGTAAA GTGAACTTAA AAATTCAATC 2220
AAATGAACAA TTAAATAAAC ACCTGTGTGT TTAAGACAAA ATAAAAATGG AGGAGAACAA 2280
AAAAAAAGGG GCGGT 2295 



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: TLYMNOT06

(B) CLONE: 3000534 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145 :
CGGGGACGGA AGCAGCCCCT GGGCCCGAGG GGCTCGAGGC CGGGCCGGGG CGATGTGGAG 60
CGCGGGCCGC GGCGGGGCTG CCTGGCCGGT GCTGTTGGGG CTGCTGCTGG CGCTGTTAGT 120
GCCGGGCGGT GGTGCCGCCA AGACCGGTGC GGAGCTCGTG ACCTGCGGGT CGGTGCTGAA 180
GCTGCTCAAT ACGCACCACC GCGTGCGGCT GCACTCGCAC GACATCAAAT ACGGATCCGG 240
CAGCGGCCAG CAATCGGTGA CCGGCGTAGA GGCGTCGGAC GACGCCAATA GCTACTGGCG 300
GATCCGCGGC GGCTCGGAGG GCGGGTGCCC GCGCGGGTCC CCGGTGCGCT GCGGGCAGGC 360
GGTGAGGCTC ACGCATGTGC TTACGGGCAA GAACCTGCAC ACGCACCACT TCCCGTCGCC 420
GCTGTCCAAC AACCAGGAGG TGAGTGCCTT TGGGGAAGAC GGCGAGGGCG ACGACCTGGA 480
CCTATGGACA GTGCGCTGCT CTGGACAGCA CTGGGAGCGT GAGGCTGCTG TGCGCTTCCA 540
GCATGTGGGC ACCTCTGTGT TCCTGTCAGT CACGGGTGAG CAGTATGGAA GCCCCATCCG 600
TGGGCAGCAT GAGGTCCACG GCATGCCCAG TGCCAACACG CACAATACGT GGAAGGCCAT 660
GGAAGGCATC TTCATCAAGC CTAGTGTGGA GCCCTCTGCA GGTCACGATG AACTCTGAGT 720
GTGTGGATGG ATGGGTGGAT GGAGGGTGGC AGGTGGGGCG TCTGCAGGGC CACTCTTGGC 780
AGAGACTTTG GGTTTGTAGG GGTCCTCAAG TGCCTTTGTG ATTAAAGAAT GTTGGTCTAA 840
AA 842



(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2345 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: HEAANOT01 

(B) CLONE: 3046870 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146 : 
GTCCCGCCCC GCAGCTGCGC GCAGGCGCTC GACGAGCCGC TCGCATTCTA CGTAACGGAC 60 
GGCGGAGGCT ACGTGAAGAG AGGCGCGGCG TGACTGAGCT ACGGTTCTGG CTGCGTCCTA 120 
GAGGCATCCG GGGCAGTAAA ACCGCTGCGA TCGCGGAGGC GGCGGCCAGG CCGAGAGGCA 180
GGCCGGGCAG GGGTGTCGGA CGCAGGGCGC TGGGCCGGGT TTCGGCTTCG GCCACAGCTT 240
TTTTTCTCAA GGTGCAATGA AAGCCTTCCA CACTTTCTGT GTTGTCCTTC TGGTGTTTGG 300
GAGTGTCTCT GAAGCCAAGT TTGATGATTT TGAGGATGAG GAGGACATAG TAGAGTATGA 360
TGATAATGAC TTCGCTGAAT TTGAGGATGT CATGGAAGAC TCTGTTACTG AATCTCCTCA 420
ACGGGTCATA ATCACTGAAG ATGATGAAGA TGAGACCACT GTGGAGTTGG AAGGGCAGGA 480
TGAAAACCAA GAAGGAGATT TTGAAGATGC AGATACCCAG GAGGGAGATA CTGAGAGTGA 540
ACCATATGAT GATGAAGAAT TTGAAGGTTA TGAAGACAAA CCAGATACTT CTTCTAGCAA 600
AAATAAAGAC CCAATAACGA TTGTTGATGT TCCTGCACAC CTCCAGAACA GCTGGGAGAG 660
TTATTATCTA GAAATTTTGA TGGTGACTGG TCTGCTTGCT TATATCATGA ATTACATCAT 720
TGGGAAGAAT AAAAACAGTC GCCTTGCACA GGCCTGGTTT AACACTCATA GGGAGCTTTT 780
GGAGAGCAAC TTTACTTTAG TGGGGGATGA TGGAACTAAC AAAGAAGCCA CAAGCACAGG 840
AAAGTTGAAC CAGGAGAATG AGCACATCTA TAACCTGTGG TGTTCTGGTC GAGTGTGCTG 900
TGAGGGCATG CTTATCCAGC TGAGGTTCCT CAAGAGACAA GACTTACTGA ATGTCCTGGC 960
CCGGATGATG AGGCCAGTGA GTGATCAAGT GCAAATAAAA GTAACCATGA ATGATGAAGA 1020
CATGGATACC TACGTATTTG CTGTTGGCAC ACGGAAAGCC TTGGTGCGAC TACAGAAAGA 1080
GATGCAGGAT TTGAGTGAGT TTTGTAGTGA TAAACCTAAG TCTGGAGCAA AGTATGGACT 1140
GCCGGACTCT TTGGCCATCC TGTCAGAGAT GGGAGAAGTC ACAGACGGAA TGATGGATAC 1200
AAAGATGGTT CACTTTCTTA CACACTATGC TGACAAGATT GAATCTGTTC ATTTTTCAGA 1260
CCAGTTCTCT GGTCCAAAAA TTATGCAAGA GGAAGGTCAG CCTTTAAAGC TACCTGACAC 1320
TAAGAGGACA CTGTTGTTTA CATTTAATGT GCCTGGCTCA GGTAACACTT ACCCAAAGGA 1380
TATGGAGGCA CTGCTACCCC TGATGAACAT GGTGATTTAT TCTATTGATA AAGCCAAAAA 1440
GTTCCGACTC AACAGAGAAG GCAAACAAAA AGCAGATAAG AACCGTGCCC GAGTAGAAGA 1500 
GAACTTCTTG AAACTGACAC ATGTGCAAAG ACAGGAAGCA GCACAGTCTC GGCGGGAGGA 1560
GAAAAAAAGA GCAGAGAAGG AGCGAATCAT GAATGAGGAA GATCCTGAGA AACAGCGCAG 1620
GCTGGAGGAG GCTGCATTGA GGCGTGAGCA AAAGAAGTTG GAAAAGAAGC AAATGAAAAT 1680
GAAACAAATC AAAGTGAAAG CCATGTAAAG CCATCCCAGA GATTTGAGTT CTGATGCCAC 1740
CTGTAAGCTC TGAATTCACA GGAAACATGA AAAACGCCAG TCCATTTCTC AACCTTAAAT 1800
TTCAGACAGT CTTGGGCAAC TGAGAAATCC TTATTTCATC ATCTACTCTG TTTGGGGTTT 1860
GGGGTTTTAC AGAGATTGAA GATACCTGGA AAGGGCTCTG TTTCAAGAAT TTTTTTTTCC 1920
AGATAATCAA ATTATTTTGA TTATTTTATA AAAGGAATGA TCTATGAAAT CTGTGTAGGT 1980
TTTAAATATT TTAAAAATTA TAATACAAAT CATCAGTGCT TTTAGTACTT CAGTGTTTAA 2040
AGAAATACCA TGAAATTTAT AGGTAGATAA CCAGATTGTT GCTTTTTGTT TAAACCAAGC 2100
AGTTGAAATG GCTATAAAGA CTGACTCTAA ACCAAGATTC TGCAAATAAT GATTGGAATT 2160
GCACAATAAA CATTGCTTGA TGTTTTCTTG TATGTCTACA TTAAACTTGA GAAAAAGTAA 2220
AAATTAGAAC ACTGTATGTA GTAATGAAAT TTCAGGGACC CAGAACATAA TGTAGTATAT 2280
GTTTTTAGGT GGGAGATGCT GATAACAAAA TTAATAGGAA GTCTGTAGGC ATTAGGATAC 2340
TGACA 2345



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2215 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: PONSAZT01 

(B) CLONE: 3057669 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147 : 
CCCACGCGTC CGCCCACGCG TCCGTTTTCA GTAGGGATTT CCTGTGACCA GACAAGTTCA 60 
TCTGAGAGCC AGTTCTCACC ACTGGAATTC TCAGGAATGG ACCATGAGGA CATCAGTGAG 120
TCAGTGGATG CAGCATACAA CCTCCAGGAC AGTTGCCTTA CAGACTGTGA TGTGGAAGAT 180
GGGACTATGG ATGGCAATGA TGAGGGGCAC TCCTTTGAAC TTTGTCCTTC TGAAGCTTCT 240
CCTTATGTAA GGTCAAGGGA GAGAACCTCC TCTTCAATAG TATTTGAAGA TTCTGGCTGT 300 
GATAATGCTT CCAGTAAAGA AGAGCCGAAA ACTAATCGAT TGCATATTGG CAACCATTGT 360 
GCTAATAAAC TAACTGCTTT CAAGCCCACC AGTAGCAAAT CTTCTTCTGA AGCTACATTG 420 
TCTATTTCTC CTCCAAGACC AACCACTTTA AGTTTAGATC TCACTAAAAA CACCACAGAA 480
AAACTCCAGC CCAGTTCACC AAAGGTGTAT CTTTACATTC AAATGCAGCT GTGCAGAAAA 540
GAAAACCTCA AAGACTGGAT GAATGGACGA TGTACCATAG AGGAGAGAGA GAGGAGCGTG 600
TGTCTGCACA TCTTCCTGCA GATCGCAGAG GCAGTGGAGT TTCTTCACAG TAAAGGACTG 660
ATGCACAGGG ACCTCAAGCC ATCCAACATA TTCTTTACAA TGGATGATGT GGTCAAGGTT 720
GGAGACTTTG GGTTAGTGAC TGCAATGGAC CAGGATGAGG AAGAGCAGAC GGTTCTGACC 780
CCAATGCCAG CTTATGCCAG ACACACAGGA CAAGTAGGGA CCAAACTGTA TATGAGCCCA 840
GAGCAGATTC ATGGAAACAG CTATTCTCAT AAAGTGGACA TCTTTTCTTT AGGCCTGATT 900
CTATTTGAAT TGCTGTATCC ATTCAGCACT CAGATGGAGA GAGTCAGGAC CTTAACTGAT 960
GTAAGAAATC TCAAATTTCC ACCATTATTT ACTCAGAAAT ATCCTTGTGA GTACGTGATG 1020
GTTCAAGACA TGCTCTCTCC ATCCCCCATG GAACGACCTG AAGCTATAAA CATCATTGAA 1080
AATGCTGTAT TTGAGGACTT GGACTTTCCA GGAAAAACAG TGCTCAGACA GAGGTCTCGC 1140
TCCTTGAGTT CATCGGGAAC AAAACATTCA AGACAGTCCA ACAACTCCCA TAGCCCTTTG 1200
CCAAGCAATT AGCCTTAAGT TGTGCTAGCA ACCCTAATAG GTGATGCAGA TAATAGCCTA 1260
CTTCTTAGAA TATGCCTGTC CAAAATTGCA GACTTGAAAA GTTTGTTCTT CGCTCAATTT 1320 
TTTTGTGGAC TACTTTTTTT ATATCAAATT TAAGCTGGAT TTGGGGGCAT AACCTAATTT 1380
GAGCCAACTC CTGAGTTTTG CTATACTTAA GGAAAGGGCT ATCTTTGTTC TTTGTTAGTC 1440
TCTTGAAACT GGCTGCTGGC CAAGCTTTAT AGCCCTCACC ATTTGCCTAA GGAGGTAGCA 1500
GCAATCCCTA ATATATATAT ATAGTGAGAA CTAAAATGGA TATATTTTTA TAATGCAGAA 1560
GAAGGAAAGT CCCCCTGTGT GGTAACTGTA TTGTTCTAGA AATATGCTTT CTAGAGATAT 1620
GATGATTTTG AAACTGATTT CTAGAAAAAG CTGACTCCAT TTTTGTCCCT GGCGGGTAAA 1680
TTAGGAATCT GCACTATTTT GGAGGACAAG TAGCACAAAC TGTATAACGG TTTATGTCCG 1740
TAGTTTTATA GTCCTATTTG TAGCATTCAA TAGCTTTATT CCTTAGATGG TTCTAGGGTG 1800
GGTTTACAGC TTTTTGTACT TTTACCTCCA ATAAAGGGAA AATGAAGCTT TTTATGTAAA 1860
TTGGTTGAAA GGTCTAGTTT TGGGAGGAAA AAAGCCGTAG TAAGAAATGG ATCATATATA 1920
TTACAACTAA CTTCTTCAAC TATGGACTTT TTAAGCCTAA TGAAATCTTA AGTGTCTTAT 1980
ATGTAATCCT GTAGGTTGGT ACTTCCCCCA AACTGATTAT AGGTAACAGT TTAATCATCT 2040
CACTTGCTAA CATGTTTTTA TTTTTCACTG TAAATATGTT TATGTTTTAT TTATAAAAAT 2100
TCTGAAATCA ATCCATTTGG GTTGGTGGTG TACAGAACAC ACTTAAGTGT GTTAACTTGT 2160 
GACTTCTTTC AAGTCTAAAT GATTTAATAA AACTTTTTTT AAATTAAAAA AAAAA 2215

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1395 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: HEAONOT03 

(B) CLONE: 3088178 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148 : 

GGTTGACATG ATGAACAATC GGTTTCGGAA GGATATGATG AAAAATGCTA GTGAAAGTAA 60
ACTTTCGAAA GACAACCTTA AAAAGAGACT TAAAGAAGAA TTCCAACATG CCATGGGAGG 120
AGTACCTGCC TGGGCAGAGA CTACTAAGCG GAAAACATCT TCAGATGATG AAAGTGAAGA 180
GGATGAAGAT GATTTGTTGC AAAGGACTGG GAATTTCATA TCCACATCAA CTTCTCTTCC 240
AAGAGGCATC TTGAAGATGA AGAACTGCCA GCATGCGAAT GCTGAACGTC CTACTGTTGC 300
TCGGATCTCA TCTGTGCAGT TCCATCCCGG TGCACAGATT GTGATGGTTG CTGGATTAGA 360
TAATGCTGTA TCACTATTTC AGGTTGATGG GAAAACAAAT CCTAAAATTC AGAGCATCTA 420
TTTGGAAAGG TTTCCAATCT TTAAGGCTTG TTTTAGTGCT AATGGGGAAG AAGTTTTAGC 480
CACGAGTACC CACAGCAAGG TTCTTTATGT CTATGACATG CTGGCTGGAA AGTTAATTCC 540
TGTGCATCAA GTGAGAGGTT TGAAAGAGAA GATAGTGAGG AGCTTTGAAG TCTCCCCAGA 600
TGGGTCCTTC TTGCTCATAA ATGGCATTGC TGGATATTTG CATTTGCTAG CAATGAAGAC 660
CAAAGAACTG ATTGGAAGCA TGAAAATTAA TGGAAGGGTT GCAGCATCCA CATTCTCTTC 720
AGATAGTAAG AAAGTATACG CCTCTTCGGG GGATGGAGAA GTTTATGTTT GGGATGTGAA 780
CTCAAGGAAG TGCCTTAACA GATTTGTTGA TGAAGGCAGT TTATATGGAT TAAGCATTGC 840
CACATCTAGG AATGGACAGT ATGTTGCTTG TGGTTCTAAT TGTGGAGTGG TAAATATATA 900
CAATCAAGAT TCTTGTCTCC AAGAAACAAA CCCAAAGCCA ATAAAAGCTA TAATGAACTT 960
GGTTACAGGT GTTACTTCTC TGACCTTCAA TCCTACTACA GAAATCTTGG CAATTGCTTC 1020
AGAAAAAATG AAAGAAGCAG TCAGATTGGT TCATCTTCCT TCCTGTACAG TATTTTCAAA 1080
CTTCCCAGTC ATTAAAAATA AGAATATTTC TCATGTTCAT ACCATGGATT TTTCTCCGAG 1140
AAGTGGATAC TTTGCCTTGG GGAATGAAAA GGGCAAGGCC CTGATGTATA GGTTGCACCA 1200
TTACTCAGAC TTCTAAAGAG ACTATTTGAA GTCCAGTTGA GTCACAAGAG AAGCCTGTCT 1260
TGATATATCA TCTCAGAAAC TTTCCTGAAT ATGTGATAAT ATATGGAAAA TGATTTATAG 1320
ATCCAGCTGT GCTTAAGAGC CAGTAATGTC TTAATAAACA TGTGGCAGCT TTTGTTTGAA 1380
AAAAAAAAAA AAAGG 1395

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2609 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: BRSTNOT19 

(B) CLONE: 3094321 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149 : 
CCCGCCATGG CACTGTCGCG GGGGCTGCCC CGGGAGCTGG CTGAGGCGGT GGCCGGGGGC 60 
CGGGTGCTGG TGGTGGGGGC GGGCGGCATC GGCTGCGAGC TCCTCAAGAA TCTCGTGCTC 120 
ACCGGTTTCT CCCACATCGA CCTGATTGAT CTGGATACTA TTGATGTAAG CAACCTCAAC 180 
AGACAGTTTT TGTTTCAAAA GAAACATGTT GGAAGATCAA AGGCACAGGT TGCCAAGGAA 240 
AGTGTACTGC AGTTTTACCC GAAAGCTAAT ATCGTTGCCT ACCATGACAG CATCATGAAC 300 
CCTGACTATA ATGTGGAATT TTTCCGACAG TTTATACTGG TTATGAATGC TTTAGATAAC 360 
AGAGCTGCCC GAAACCATGT TAATAGAATG TGCCTGGCAG CTGATGTTCC TCTTATTGAA 420
AGTGGAACAG CTGGGTATCT TGGACAAGTA ACTACTATCA AAAAGGGTGT GACCGAGTGT 480
TATGAGTGTC ATCCTAAGCC GACCCAGAGA ACCTTTCCTG GCTGTACAAT TCGTAACACA 540
CCTTCAGAAC CTATACATTG CATCGTTTGG GCAAAGTACT TGTTCAACCA GTTGTTTGGG 600 
GAAGAAGATG CTGATCAAGA AGTATCTCCT GACAGAGCTG ACCCTGAAGC TGCCTGGGAA 660 
CCAACGGAAG CCGAAGCCAG AGCTAGAGCA TCTAATGAAG ATGGTGACAT TAAACGTATT 720 
TCTACTAAGG AATGGGCTAA ATCAACTGGA TATGATCCAG TTAAACTTTT TACCAAGCTT 780
TTTAAAGATG ACATCAGGTA TCTGTTGACA ATGGACAAAC TATGGCGGAA AAGGAAACCT 840
CCAGTTCCGT TGGACTGGGC TGAAGTACAA AGTCAAGGAG AAGAAACGAA TGCATCAGAT 900
CAACAGAATG AACCCCAGTT AGGCCTGAAA GACCAGCAGG TTCTAGATGT AAAGAGCTAT 960
GCACGTCTTT TTTCAAAGAG CATCGAGACT TTGAGAGTTC ATTTAGCAGA AAAGGGGGAT 1020
GGAGCTGAGC TCATATGGGA TAAGGATGAC CCATCTGCAA TGGATTTTGT CACCTCTGCT 1080
GCAAACCTCA GGATGCATAT TTTCAGTATG AATATGAAGA GTAGATTTGA TATCAAATCA 1140
ATGGCAGGGA ACATTATTCC TGCTATTGCT ACTACTAATG CAGTAATTGC TGGGTTGATA 1200
GTATTGGAAG GATTGAAGAT TTTATCAGGA AAAATAGACC AGTGCAGAAC AATTTTTTTG 1260
AATAAACAAC CAAACCCAAG AAAGAAGCTT CTTGTGCCTT GTGCACTGGA TCCTCCCAAC 1320
CCCAATTGTT ATGTATGTGC CAGCAAGCCA GAGGTGACTG TGCGGCTGAA TGTCCATAAA 1380
GTGACTGTTC TCACCTTACA AGACAAGATA GTGAAAGAAA AATTTGCTAT GGTAGCACCA 1440
GATGTCCAAA TTGAAGATGG GAAAGGAACA ATCCTAATAT CTTCCGAAGA GGGAGAGACG 1500
GAAGCTAATA ATCACAAGAA GTTGTCAGAA TTTGGAATTA GAAATGGCAG CCGGCTTCAA 1560
GCAGATGACT TCCTCCAGGA CTATACTTTA TTGATCAACA TCCTTCATAG TGAAGACCTA 1620
GGAAAGGACG TTGAATTTGA AGTTGTTGGT GATGCCCCGG AAAAAGTGGG GCCCAAACAA 1680
GCTGAAGATG CTGCCAAAAG CATAACCAAT GGCAGTGATG ATGGAGCTCA GCCCTCCACC 1740
TCCACAGCTC AAGAGCAAGA TGACGTTCTC ATAGTTGATT CGGATGAAGA AGATTCTTCA 1800
AATAATGCCG ACGTCAGTGA AGAAGAGAGA AGCCGCAAGA GGAAATTAGA TGAGAAAGAG 1860
AATCTCAGTG CAAAGAGGTC ACGTATAGAA CAGAAGGAAG AGCTTGATGA TGTCATAGCA 1920
TTAGATTGAA CAGAAATGCC TCTAAACAGA ACCCTCTTAC TATTTAGTTT ATCTGGGCAG 1980
AACCAGATTG TTATGTCCTT TGTTCCAAAG GGAAAAAATT GACAGCAGTG ACTTGAAAAT 2040
GATTCTGCTC CCTTTGAAAG CATTCATTTT GCTAGAACTG TTAGACACAT TGCAGTATGC 2100
TGTATTGAAA GTAGGAATAT AGTTTTAAAA ACCCTTTGAA CAAAGTGTGT GCATAACCAG 2160
TCATGAGATA AAACAACACA ATGCATGTTG CCTTTTTAAT GTAAATACCC TTAGGTATCA 2220
TTAATAGTTT CAAAATATTG TGGTTTAGTA AAGTTGATAC CTGGTTATAA ATATTATGCC 2280
TTTATTTTTG GCTAGAAGAA GAATTATTTT TAGCCTAGAT CTAACCATTT TCATACTCTT 2340
AACTGATTGA AACAGATTCA AAGAAGTATC GAGTGCTATG CATTGAAACT TGTTTTTAAA 2400
TGTTAGATGG CACTATGTAT ATTAATGTAA AACAATGTTA ATTTACTCAA GTTTTCAGTT 2460
TGTACCGCCT GGTATGTCTG TGTAAGAAGC CAATTTTTGT GTATTGTTAC AGTTTCAGGT 2520
TATTTATATT CGATGTTTTG TAAAACTCAA ATAACGACTA TACTTATGGA CCAAATAAAT 2580
GGCATCTGCA TTCTTGTTAA AAAAAAAAA 2609



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3633 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT13 

(B) CLONE: 3115936

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150 :
CCTGAGGGAT CCACAGAGGG TGCGGTCCTT GGAGGGAGGA CATGCAGTGC CACGTGCCAT 60
GGACCAGCCA GTGGACCCCA TGGCCAGCAA GGCTGCTCCT GGGGCCAGTG GGGTGGACAG 120
TCCCGCCCAC GCAGGTGACT GAGGTGCCAG TGTGGGAATG AAAATGCGGC CTGTGCTCCT 180
GGGCCCATGC GTCTCACGCT GCCCTTCCTC TCCAGGGAAG CCTGTGTACC TGCTACTTTT 240
TCCCGAACAA TTCATGGTAA AAACACAAAT GGTATATGGA CAAGATACTG AATGTGGAAG 300
AAACCTACTT GACAGTGTTG GTGAAAATAG GGCCAGGATT TCACACCCGT GAATGCTTTT 360
TACTGAAAAG TATTTTGTGT TTTTCTCCCA GTTACAGAAT GTCTGAAGGG GACAGTGTGG 420
GAGAATCCGT CCATGGGAAA CCTTCGGTGG TGTACAGATT TTTCACAAGA CTTGGACAGA 480
TTTATCAGTC CTGGCTAGAC AAGTCCACAC CCTACACGGC TGTGCGATGG GTCGTGACAC 540
TGGGCCTGAG CTTTGTCTAC ATGATTCGAG TTTACCTGCT GCAGGGTTGG TACATTGTGA 600
CCTATGCCTT GGGGATCTAC CATCTAAATC TTTTCATAGC TTTTCTTTCT CCCAAAGTGG 660
ATCCTTCCTT AATGGAAGAC TCAGATGACG GTCCTTCGCT ACCCACCAAA CAGAACGAGG 720
AATTCCGCCC CTTCATTCGA AGGCTCCCAG AGTTTAAATT TTGGCATGCG GCTACCAAGG 780
GCATCCTTGT GGCTATGGTC TGTACTTTCT TCGACGCTTT CAACGTCCCG GTGTTCTGGC 840
CGATTCTGGT GATGTACTTC ATCATGCTCT TCTGTATCAC GATGAAGAGG CAAATCAAGC 900
ACATGATTAA GTACCGGTAC ATCCCGTTCA CACATGGGAA GAGAAGGTAC AGAGGCAAGG 960
AGGATGCCGG CAAGGCCTTC GCCAGCTAGA AGCGGGACTG AGGCTGCCTC ACGTGTTGCA 1020
AGAACAGTTT TGAGCCATTG TTAACAATGC CTTTTTTCTT CACATAAAGT AGTTGATTAC 1080
GAGGGAGTCA AATTTTCTTT TTAAAAAGGA GCTTCAATGA TTTGTAACTG AAATATCAGG 1140
TTCTAGAAGA AACTGGCGCT TAAACCAAAT CGCATGGATT TCTTTTTCAG TGACGTTCAA 1200
GTGTTTCTCA CGGATGGAAT TCTAGTCAGC TGCAGGCGGG AAGCCAGGCG GGTGGAGCCC 1260
ATGGGAGCAA GGGCGAGTGG CCGGTCCCCG CTGTGCCAGG TGGGCAGGCA GGAGCAAGGC 1320
CTGCGAGGGA GGAACGGGCC GCTCCCCGCC AGCCGCCTTC CCCAGCAGCC GCAGGTGGTG 1380
CCAGCCACTC CACAGAGCCC GAGGGATGAT CTAGCCTGAT TCCTGCGTGT CCGAAAGAAC 1440
TTAACGTTTT AAAGGTGATT GTCAAGTAAC TGTGTGGGGT TCTAATGCCA GTTTCCTAAT 1500
TCCATCTCAC TGGAGATGTT TAAAGTTGGC CTCTATCCTA ATGACTCAAA ACTTGGTTCT 1560
TAACTACCAT GATTGCTTTT GAGGGCCCGG AATTATAAAT ATATATTATA TTTTAATTGT 1620
TTGAGATTAT TTTGACACAT TTCTTTGATA CGTAGAGTGT TTTGTTTTTA ATTTAAATCT 1680
GTCCTCATGC AACCCTCCAT GAGGGGCAGC GAAGCTGGCA GGGAGCAGAC TGGCTTTGTA 1740
GGTTCAGCAC TCGGCCCCCC ACTGCGGGAG AGGCGGAACC CACTTGCATG TCAGCGTTTT 1800
TGATTCGAGA AAAGAAATAC TCTCAACGTT TTACCAAGTG ATTTTACCTC CACCTTTACT 1860
AAAGTCTTTA CCTAAAACAT GGCAGTCGCT GGACACAGGA AAGCCCACCT TTTGTTTGGC 1920
CTTTTCGAAA GGTGACCCAT ATTGCACAGC AGAACATCAC AGCTGTGGTC CCAGATGAGA 1980
CACTGACATG CGAGTGAAGG CCTCTCCTCC TGGGCCCCGG GCTGCGCAGG CTCCTCACTC 2040
TGGGCGGTGT TTCCTGTCTC AGAATTGACA CGGTGAATGC TTAGTGTCTG GATTTTCTTG 2100 
TGCCAGTGTT TACATATCTG ACATCGAGCT CCTCTAAGAG GCCACGTTCA AGCTTGTGTG 2160 
TCCCTGACCC AAGATAGCCA GTGCTGCTCC CAGGTGGTAC TTCTGGTACC GTGTTGAGAC 2220 
ACTTGGGATT CTCAGACTGT GGACAGGAGT GTTTGTCATT TTTCATACTG TTTTCTTAAT 2280 
AAGCGCTCAG GCCTAAGGTG TGACAGGAAG TCGCACGCGC TTGGCCAGAG CACAGTGAAG 2340 
CAAAGGACTG GGTGCTGATG GATGGAGCCA CGGCGGCATC TGCCCACCCG GCCGCAGCCC 2400
CCAGTGCCTC TCCTGGTGGT CCTCCCAGTC TAGAGGGTCA CGGCCCCCCC GCCCTCCTCC 2460
GTCTCTGGCA AGCTGACCTT GACTAACCCA GGAATACAGG GTCATCCTCA TTCCTAAGTA 2520
AGTCAAACAG CAAGACATGG TTTGCGCGGG TCTTTGCCGG AAGCCGGTCC TGCTGGCCAG 2580
GTGTTTTACG TCAGCAGGGA AATGTGGCAC ACGCCCTCGA GGCATTTTAA CACTGTGCTT 2640
CAGGAAATCT CAAGTTCCAT CTTGTGTTAG TAACGTACCC ACATTTTGCT GGAGTTAGTT 2700
TATTAAAGAT GCCTACGGTG AACTCTCTGG CGCAGGTTAA ATGCAGTTTT GAAAACCTGG 2760
AAACATCAAA TGGAGGCGGG AAATAGGCTG GGGCCGAGCT GAGGGGCTGA ACACAGCAGT 2820
GACCGTGGGT CAGCAGGTCG CCTGCCCAGC AGGCCCCCCA GGAGAGGGCT CGGGCGCCCC 2880
TGGCAGCCCC CATACCCCCA GGACCTGGCT CGTGAGTGCG TCTGGGTCAG GAAGAGACCT 2940
CTCTGTGCGT CTCAGGCTGA GATGCAGATT TCTGTTTTCT AAAACTGGAA GCGACCTTGA 3000
CGTGTATTGA AGGTGTGTGT GCCAAATGCT TCCGACGGAG GTGCTGGCCT TGGTTGGTTT 3060
CTCTCTGCCC CGTGTGGTCA TCAAGTCCTG GGGGATGTGC TCTGCCCAGC CGCCCTCGGG 3120 
GAGAGCAGCG CCGCCTCCCA TGGGGCCGTG GGGCTGCTGT TCTCACTGCA CTGGCTGAAG 3180 
CAACCCGCCA GCCTCCGTGC CCCACCCCAC CCAGCACGCA CTCATTCAGT CCATTGCCTT 324 0 
AACACAAGCC TGATGGGGCT GTTTTCTCAC AATATAAACG AATAAAGTGT CTTCTGGCCT 3300 
ACTTCTGAAT TACTTCTCAA CTGTATGGTT TGGGGAAGGG AGGGAAACCT AAAATCCCGT 3360 
CCAAATAAGT GAAATTCCTG AAGAAGTGGC TGAGTCCTAC CAGGT TGGGG TTAGGGAAAT 34 20 
GTTCTGGGTT CAGGCGCCCC TCCCAGGGCT GAGAAAGCGC AGCCAGGGAC AGCTTTCTGT 34 80 
TCTCTCCCAG GGTGGCTAGG TTAGTATCTT AC AT GAC AAA AAAC T GAG AG TGTTCTAACT 354 0 
TCTGTGCAAG CAAGGTTAAT CCTGAGACTA AATCTTGGCG TTCAGACTCC CGTAGAGGTC 3 600 
ATCTGTGTCC AGGCCCACCC GGGCGCCGGC TCA 3 633 

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2018 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT13 

(B) CLONE: 3116522 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

TGGCTCGCTG GCCGCTCCTG GAGGCGGCGG CGGGAGCGCA GGGGGCGCGC GGCCCGGGGA 60

CTCGCATTCC CCGGTTCCCC CTCCACCCCA CGCGGCCTGG ACCATGGACG CCAGATGGTG 120

GGCAGTGGTG GTGCTGGCTG CGTTCCCCTC CCTAGGGGCA GGTGGGGAGA CTCCCGAAGC 180

CCCTCCGGAG TCATGGACCC AGCTATGGTT CTTCCGATTT GTGGTGAATG CTGCTGGCTA 240

TGCCAGCTTT ATGGTACCTG GCTACCTCCT GGTGCAGTAC TTCAGGCGGA AGAACTACCT 300

GGAGACCGGT AGGGGCCTCT GCTTTCCCCT GGTGAAAGCT TGTGTGTTTG GCAATGAGCC 360

CAAGGCCTCT GATGAGGTTC CCCTGGCGCC CCGAACAGAG GCGGCAGAGA CCACCCCGAT 420

GTGGCAGGCC CTGAAGCTGC TCTTCTGTGC CACAGGGCTC CAGGTGTCTT ATCTGACTTG 480

GGGTGTGCTG CAGGAAAGAG TGATGACCCG CAGCTATGGG GCCACAGCCA CATCACCGGG 540

TGAGCGCTTT ACGGACTCGC AGTTCCTGGT GCTAATGAAC CGAGTGCTGG CACTGATTGT 600

GGCTGGCCTC TCCTGTGTTC TCTGCAAGCA GCCCCGGCAT GGGGCACCCA TGTACCGGTA 660

CTCCTTTGCC AGCCTGTCCA ATGTGCTTAG CAGCTGGTGC CAATACGAAG CTCTTAAGTT 720

CGTCAGCTTC CCCACCCAGG TGCTGGCCAA GGCCTCTAAG GTGATCCCTG TCATGCTGAT 780

GGGAAAGCTT GTGTCTCGGC GCAGCTACGA ACACTGGGAG TACCTGACAG CCACCCTCAT 840

CTCCATTGGG GTCAGCATGT TTCTGCTATC CAGCGGACCA GAGCCCCGCA GCTCCCCAGC 900

CACCACACTC TCAGGCCTCA TCTTACTGGC AGGTTATATT GCTTTTGACA GCTTCACCTC 960

AAACTGGCAG GATGCCCTGT TTGCCTATAA GATGTCATCG GTGCAGATGA TGTTTGGGGT 1020

CAATTTCTTC TCCTGCCTCT TCACAGTGGG CTCACTGCTA GAACAGGGGG CCCTACTGGA 1080

GGGAACCCGC TTCATGGGGC GACACAGTGA GTTTGCTGCC CATGCCCTGC TACTCTCCAT 1140

CTGCTCCGCA TGTGGCCAGC TCTTCATCTT TTACACCATT GGGCAGTTTG GGGCTGCCGT 1200

CTTCACCATC ATCATGACCC TCCGCCAGGC CTTTGCCATC CTTCTTTCCT GCCTTCTCTA 1260

TGGCCACACT GTCACTGTGG TGGGAGGGCT GGGGGTGGCT GTGGTCTTTG CTGCCCTCCT 1320

GCTCAGAGTC TACGCGCGGG GCCGTCTAAA GCAACGGGGA AAGAAGGCTG TGCCTGTTGA 1380

GTCTCCTGTG CAGAAGGTTT GAGGGTGGAA AGGGCCTGAG GGGTGAAGTG AAATAGGACC 1440

CTCCCACCAT CCCCTTCTGC TGTAACCTCT GAGGGAGCTG GCTGAAAGGG CAAAATGCAG 1500

GTGTTTTCTC AGTATCACAG ACCAGCTCTG CAGCAGGGGA TTGGGGAGCC CAGGAGGCAG 1560

CCTTCCCTTT TGCCTTAAGT CACCCATCTT CCAGTAAGCA GTTTATTCTG AGCCCCGGGG 1620

GTAGACAGTC CTCAGTGAGG GGTTTTGGGG AGTTTGGGGT CAAGAGAGCA TAGGTAGGTT 1680

CCACAGTTAC TCTTCCCACA AGTTCCCTTA AGTCTTGCCC TAGCTGTGCT CTGCCACCTT 1740

CCAGACTCAC TCCCCTCTGC AAATACCTGC ATTTCTTACC CTGGTGAGAA AAGCACAAGC 1800

GGTGTAGGCT CCAATGCTGC TTTCCCAGGA GGGTGAAGAT GGTGCTGTGC TGAGGAAAGG 1860

GGATGCAGAG CCCTGCCCAG CACCACCACC TCCTATGCTC CTGGATCCCT AGGCTCTGTT 1920

CCATGAGCCT GTTGCAGGTT TTGGTACTTT AGAAATGTAA CTTTTTGCTC TTATAATTTT 1980

ATTTTATTAA ATTAAATTAC TGCAGTGGAA AAAAAAAA 2018



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 942 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT13

(B) CLONE: 3117184 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
CCTCCATCAG CTCGCCGCGC AGCGGCTGTA TTTGCGGCCT GTGCGAGTAG GCGCTTGGGC 60
ACTCAGTCTC CCTGGCGGGC GACGGGCAGA AATCTCGAAC CAGTGGAGCG CACTCGTAAC 120
CTGGATCCCA GAAGGTCGCG AAGGCAGTAC CGTTTCCTCA GCGGCGGACT GCTGCAGTAA 180
GAATGTCTTT TCCACCTCAT TTGAATCGCC CTCCCATGGG AATCCCAGCA CTCCCACCAG 240
GGACCCCACC CCCGCAGTTT CCAGGATTTC CTCCACCTGT ACCTCCAGGG ACCCCAATGA 300
TTCCTGTACC AATGAGCATT ATGGCTCCTG CTCCGACTGT CTTAGTACCC ACTGTGTCTA 360
TGGTTGGAAA GCATTTGGGC GCAAGAAAGG ATCATCCAGG CTTAAAGGCT AAAGAAAATG 420
ATGAAAATTG TGGTCCTACT ACCACTGTTT TTGTTGGCAA CATTTCCGAG AAAGCTTCAG 480
ACATGCTTAT AAGACAACTC TTAGCTAAAT GTGGTTTGGT TTTGAGCTGG AAGAGAGTAC 540
AAGGTGCTTC CGGAAAGCTT CAAGCCTTCG GATTCTGTGA GTACAAGGAG CCAGAATCTA 600
CCCTCCGTGC ACTCAGATTA TTACATGACC TGCAAATTGG AGAGAAAAAG CTACTCGTTA 660
AAGTTGATGC AAAGACAAAG GCACAGCTGG ATGAATGGAA AGCAAAGAAG AAAGCTTCTA 720
ATGGGAATGC AAGGCCAGAA ACTGTCACTA ATGACGATGA AGAAGCCTTG GATGAAGAAA 780
CAAAGAGGAG AGATCAGATG ATTAAAGGGG CTATTGAAGT TTTAATTCGT GAATACTCCA 840
GTGAGCTAAA TGCCCCCTCA CAGGAATCTG ATTCTCACCC CAGGAAGAAG AAGAAGGAAA 900
AGAAGGAGGA CATTTTCGGC AGATTTCAGT GGGCCCACTG AT 942



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2060 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LNODNOT05 

(B) CLONE: 3125156 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:
TCCCCCCCTC AGCCTCCCCC CCCCCCACTG GCATATGGTC CTGCCCCTTC TACCAGACCC 60
ATGGGCCCCC AGGCAGCCCC TCTTACCATT CGAGGGCCCT CGTCTGCTGG CCAGTCCACC 120
CCTAGTCCCC ACCTGGTGCC TTCACCTGCC CCATCTCCAG GGCCTGGTCC GGTACCCCCT 180
CGCCCCCCAG CAGCAGAACC ACCCCCTTGC CTGCGCCGAG GCGCCGCAGC TGCAGACCTG 240
CTCTCCTCCA GCCCGGAGAG CCAGCATGGC GGCACTCAGT CTCCTGGGGG TGGGCAGCCC 300
CTGCTGCAGC CCACCAAGGT GGATGCAGCT GAGGGTCGTC GGCCGCAGGC CCTGCGGCTG 360
ATTGAGCGGG ACCCCTATGA GCATCCTGAG AGGCTGCGGC AGTTGCAGCA GGAGCTGGAG 420
GCCTTTCGGG GTCAGCTGGG GGATGTGGGA GCTCTGGACA CTGTCTGGCG AGAGCTGCAA 480
GATGCGCAGG AACATGATGC CCGAGGCCGT TCCATCGCCA TTGCCCGCTG CTACTCACTG 540
AAGAACCGGC ACCAGGATGT CATGCCCTAT GACAGTAACC GTGTGGTGCT GCGCTCAGGC 600
AAGGATGACT ACATCAATGC CAGCTGCGTG GAGGGGCTCT CCCCATACTG CCCCCCGCTA 660
GTGGCAACCC AGGCCCCACT GCCTGGCACA GCTGCTGACT TCTGGCTCAT GGTCCATGAG 720
CAGAAAGTGT CAGTCATTGT CATGCTGGTT TCTGAGGCTG AGATGGAGAA GCAAAAAGTG 780
GCACGCTACT TCCCCACCGA GAGGGGCCAG CCCATGGTGC ACGGTGCCCT GAGCCTGGCA 840
TTGAGCAGCG TCCGCAGCAC CGAAACCCAT GTGGAGCGCG TGCTGAGCCT GCAGTTCCGA 900
GACCAGAGCC TCAAGCGCTC TCTTGTGCAC CTGCACTTCC CCACTTGGCC TGAGTTAGGC 960
CTGCCCGACA GCCCCAGCAA CTTGCTGCGC TTCATCCAGG AGGTGCACGC ACATTACCTG 1020
CATCAGCGGC CGCTGCACAC GCCCATCATT GTGCACTGCA GCTCTGGTGT GGGCCGCACG 1080
GGAGCCTTTG CACTGCTCTA TGCAGCTGTG CAGGAGGTGG AGGCTGGGAA CGGAATCCCT 1140
GAGCTGCCTC AGCTGGTGCG GCGCATGCGG CAGCAGAGAA AGCACATGCT GCAGGAGAAG 1200
CTGCACCTCA GGTTCTGCTA TGAGGCAGTG GTGAGACACG TGGAGCAGGT CCTGCAGCGC 1260
CATGGTGTGC CTCCTCCATG CAAACCCTTG GCCAGTGCAA GCATCAGCCA GAAGAACCAC 1320
CTTCCTCAGG ACTCCCAGGA CCTGGTCCTC GGTGGGGATG TGCCCATCAG CTCCATCCAG 1380
GCCACCATTG CCAAGCTCAG CATTCGGCCT CCTGGGGGGT TGGAGTCCCC GGTTGCCAGC 1440
TTGCCAGGCC CTGCAGAGCC CCCAGGCCTC CCGCCAGCCA GCCTCCCAGA GTCTACCCCA 1500
ATCCCATCTT CCTCCCAAAC CCCCTTTCCT CCCCACTACC TGAGGCTCCC CAGCCTAAGG 1560
AGGAGCCGCC AGTGCCTGAA GCCCCCAGCT CGGGGCCCCC CTCCTCCTCC CTGGAATTGC 1620
TGGCCTCCTT GACCCCAGAG GCCTTCTCCC TGGACAGCTC CCTGCGGGGC AAACAGCGGA 1680
TGAGCAAGCA TAACTTTCTG CAGGCCCATA ACGGGCAAGG GCTGCGGGCC ACCCGGCCCT 1740
CTGACGACCC CCTCAGCCTT CTGGATCCAC TCTGGACACT CAACAAGACC TGAACAGGTT 1800
TTGCCTACCT GGTCCTTACA CTACATCATC ATCATCTCAT GCCCACCTGC CCACACCCAG 1860
CAGAGCTTCT CAGTGGGCAC AGTCTCTTAC TCCCATTTCT GCTGCCTTTG GCCCTGCCTG 1920
GCCCAGCCTG CACCCCTGTG GGGTGGAAAT GTACTGCAGG CTCTGGGTCA GGTTCTGCTC 1980
CTTTATGGGA CCCGACATTT TTCAGCTCTT TGCTATTGAA ATAATAAACC ACCCTGTTCT 2040
GTGAAAAAAA AAAAAAAAAG 2060



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2065 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: LUNGTUT12 

(B) CLONE: 3129120 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:
CGGGTCCCCG GGTCTGACAG GAGCAGCCTG TGGGCACCGC GGCGGTAGTT GGAGGCGGGA 60
GAGGGTCCGT AGCCGCGCCG CCCTGCCCCG CCATGGGCCT CCTGTCGGAC CCGGTTCGCC 120
GGCGCGCGCT CGCCCGCCTA GTGCTGCGCC TCAACGCGCC GTTGTGCGTG CTGAGCTACG 180
TGGCGGGCAT CGCCTGGTTC TTGGCGCTGG TTTTCCCGCC GCTGACCCAG CGCACTTACA 240
TGTCGGAGAA CGCCATGGGC TCCACCATGG TGGAGGAGCA GTTTGCGGGC GGAGACCGTG 300
CCCGGGCTTT TGCCCGGGAC TTCGCCGCCC ACCGCAAGAA GTCGGGGGCT CTGCCAGTGG 360
CCTGGCTTGA ACGGACGATG CGGTCAGTAG GGCTGGAGGT CTACACGCAG AGTTTCTCCC 420
GGAAACTGCC CTTCCCAGAT GAGACCCACG AGCGCTATAT GGTGTCGGGC ACCAACGTGT 480
ACGGCATCCT GCGGGCCCCG CGTGCTGCCA GCACCGAGTC GCTTGTGCTC ACCGTGCCCT 540
GTGGCTCTGA CTCTACCAAC AGCCAGGCTG TGGGGCTGCT GCTGGCACTG GCTGCCCACT 600
TCCGGGGGCA GATTTATTGG GCCAAAGATA TCGTCTTCCT GGTAACAGAA CATGACCTTC 660
TGGGCACTGA GGCTTGGCTT GAAGCCTACC ACGATGTCAA TGTCACTGGC ATGCAGTCGT 720
CTCCCCTGCA GGGCCGAGCT GGGGCCATTC AGGCAGCCGT GGCCCTGGAG CTGAGCAGTG 780
ATGTGGTCAC CAGCCTCGAT GTGGCCGTGG AGGGGCTTAA CGGGCAGCTG CCCAACCTTG 840
ACCTGCTCAA TCTCTTCCAG ACCTTCTGCC AGAAAGGGGG CCTGTTGTGC ACGCTTCAGG 900
GCAAGCTGCA GCCCGAGGAC TGGACATCAT TGGATGGACC GCTGCAGGGC CTGCAGACAC 960
TGCTGCTCAT GGTTCTGCGG CAGGCCTCCG GCCGCCCCCA CGGCTCCCAT GGCCTCTTCC 1020
TGCGCTACCG TGTGGAGGCC CTAACCCTGC GTGGCATCAA TAGCTTCCGC CAGTACAAGT 1080
ATGACCTGGT GGCAGTGGGC AAGGCTTTGG AGGGCATGTT CCGCAAGCTC AACCACCTCC 1140
TGGAGCGCCT GCACCAGTCC TTCTTCCTCT ACTTGCTCCC CGGCCTCTCC CGCTTCGTCT 1200
CCATCGGCCT CTACATGCCC GCTGTCGGCT TCTTGCTCCT GGTCCTTGGT CTCAAGGCTC 1260
TGGAACTGTG GATGCAGCTG CATGAGGCTG GAATGGGCCT TGAGGAGCCC GGGGGTGCCC 1320
CTGGCCCCAG TGTACCCCTT CCCCCATCAC AGGGTGTGGG GCTGGCCTCG CTCGTGGCAC 1380
CTCTGCTGAT CTCACAGGCC ATGGGACTGG CCCTCTATGT CCTGCCAGTG CTGGGCCAAC 1440
ACGTTGCCAC CCAGCACTTC CCAGTGGCAG AGGCTGAGGC TGTGGTGCTG ACACTGCTGG 1500
CGATTTATGC AGCTGGCCTG GCCCTGCCCC ACAATACCCA CCGGGTGGTA AGCACACAGG 1560
CCCCAGACAG GGGCTGGATG GCACTGAAGC TGGTAGCCCT GATCTACCTA GCACTGCAGC 1620
TGGGCTGCAT CGCCCTCACC AACTTCTCAC TGGGCTTCCT GCTGGCCACC ACCATGGTGC 1680
CCACTGCTGC GCTTGCCAAG CCTCATGGGC CCCGGACCCT CTATGCTGCC CTGCTGGTGC 1740
TGACCAGCCC GGCAGCCACG CTCCTTGGCA GCCTGTTCCT GTGGCGGGAG CTGCAGGAGG 1800
CGCCACTGTC ACTGGCCGAG GGCTGGCAGC TCTTCCTGGC AGCGCTAGCC CAGGGTGTGC 1860
TGGAGCACCA CACCTACGGC GCCCTGCTCT TCCCACTGCT GTCCCTGGGC CTCTACCCCT 1920
GCTGGCTGCT TTTCTGGAAT GTGCTCTTCT GGAAGTGAGA TCTGCCTGTC CGGGCTGGGA 1980
CAGAGACTCC CCAAGGACCC CATTCTGCCT CCTTCTGGGG AAATAAATGA GTGTCTGTTT 2040
CAGCAGCTAT TTGATGCTTG TCACA 2065

What is claimed is:

1. A substantially purified human signal peptide-containing protein (SIGP)
comprising a polypeptide having an amino acid sequence selected from the group consisting
of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ
ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17,
SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID
NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28,
SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID
NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID
NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50,
SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID
NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61,
SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID
NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72,
SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, and SEQ ID NO:77.

2. An isolated and purified polynucleotide which hybridizes under stringent
conditions to the polynucleotide encoding an SIGP of claim 1.

3. An isolated and purified polynucleotide encoding the SIGP of claim 1.

4. A microarray containing at least a fragment of at least one of the 
polynucleotides encoding an SIGP of claim 1. 

5. An isolated and purified polynucleotide variant having at least 90% 
polynucleotide identity to the polynucleotide of claim 3. 

6. A composition comprising the polynucleotide of claim 3.

7. An isolated and purified polynucleotide which hybridizes under stringent 
conditions to the polynucleotide of claim 3. 

8. An isolated and purified polynucleotide which is complementary to the 
polynucleotide of claim 3. 

9. An isolated and purified polynucleotide having a nucleic acid sequence
selected from the group consisting of SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ
ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:84,
SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID
NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95,
SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100, SEQ ID
NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ ID NO:104, SEQ ID NO:105, SEQ ID
NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID
NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID
NO:116, SEQ ID NO:117, SEQ ID NO:118, SEQ ID NO:119, SEQ ID NO:120, SEQ ID
NO:121, SEQ ID NO:122, SEQ ID NO:123, SEQ ID NO:124, SEQ ID NO:125, SEQ ID
NO:126, SEQ ID NO:127, SEQ ID NO:128, SEQ ID NO:129, SEQ ID NO:130, SEQ ID
NO:131, SEQ ID NO:132, SEQ ID NO:133, SEQ ID NO:134, SEQ ID NO:135, SEQ ID
NO:136, SEQ ID NO:137, SEQ ID NO:138, SEQ ID NO:139, SEQ ID NO:140, SEQ ID
NO:141, SEQ ID NO:142, SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145, SEQ ID
NO:146, SEQ ID NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID
NO:151, SEQ ID NO:152, SEQ ID NO:153, and SEQ ID NO:154.

10. An isolated and purified polynucleotide variant having at least 90%
polynucleotide identity to the polynucleotide of claim 9.

11. An isolated and purified polynucleotide which is complementary to the
polynucleotide sequence of claim 9.

12. An expression vector containing at least a fragment of the polynucleotide of 
claim 3. 

13. A host cell containing the expression vector of claim 12. 

14. A method for producing a polypeptide encoding a human signal
peptide-containing protein, the method comprising the steps of:

(a) culturing the host cell of claim 13 under conditions suitable for the 
expression of the polypeptide; and 

(b) recovering the polypeptide from the host cell culture. 

15. A pharmaceutical composition comprising the SIGP of claim 1 in conjunction
with a suitable pharmaceutical carrier.

16. A purified antibody which specifically binds to the SIGP of claim 1.

17. A purified agonist of the SIGP of claim 1.

18. A purified antagonist of the SIGP of claim 1.

19. A method for treating or preventing a cancer, the method comprising 
administering to a subject in need of such treatment an effective amount of the 
pharmaceutical composition of claim 15. 

20. A method for treating or preventing a cancer, the method comprising 
administering to a subject in need of such treatment an effective amount of the antagonist of 
claim 18. 

21. A method for treating or preventing an immune response, the method 
comprising administering to a subject in need of such treatment an effective amount of the 
antagonist of claim 18. 

22. A method for detecting a polynucleotide encoding a human signal peptide- 
containing protein in a biological sample containing nucleic acids, the method comprising the 
steps of: 

(a) hybridizing the polynucleotide of claim 8 to at least one of the nucleic 
acids of the biological sample, thereby forming a hybridization complex; and 

(b) detecting the hybridization complex, wherein the presence of the 
hybridization complex correlates with the presence of a polynucleotide encoding 
SIGP in the biological sample. 

23. The method of claim 22 wherein the nucleic acids of the biological sample are 
amplified by the polymerase chain reaction prior to the hybridizing step. 

ABSTRACT OF THE DISCLOSURE

The invention provides human signal peptide-containing proteins (SIGP) and
polynucleotides which identify and encode SIGP. The invention also provides expression 
vectors, host cells, antibodies, agonists, and antagonists. The invention also provides 
methods for treating or preventing disorders associated with expression of SIGP. 